Methods for evaluating oligonucleotide probe sequences

ABSTRACT

Methods are disclosed for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined number of unique oligonucleotides is identified. The unique oligonucleotides are chosen to sample the entire length of a nucleotide sequence that is hybridizable with the target nucleotide sequence. At least one parameter that is independently predictive of the ability of each of the oligonucleotides of the set to hybridize to the target nucleotide sequence is determined and evaluated for each of the above oligonucleotides. A subset of oligonucleotides within the predetermined number of unique oligonucleotides is identified based on the evaluation of the parameter. Oligonucleotides in the subset are identified that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence. The method may be carried out with the aid of a computer.

APPENDIX

[0001] This patent application includes an appendix (the “Appendix”), which contains the source code for the software used in carrying out the examples in accordance with the present invention.

[0002] A portion of the present disclosure contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the U.S. Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

[0003] 1. Field of the Invention

[0004] Significant morbidity and mortality are associated with infectious diseases and genetically inherited disorders. More rapid and accurate diagnostic methods are required for better monitoring and treatment of these conditions. Molecular methods using DNA probes, nucleic acid hybridization and in vitro amplification techniques are promising methods offering advantages over conventional methods used for patient diagnoses.

[0005] Nucleic acid hybridization has been employed for investigating the identity and establishing the presence of nucleic acids. Hybridization is based on complementary base pairing. When complementary single stranded nucleic acids are incubated together, the complementary base sequences pair to form double-stranded hybrid molecules. The ability of single stranded deoxyribonucleic acid (ssDNA) or ribonucleic acid (RNA) to form a hydrogen bonded structure with a complementary nucleic acid sequence has been employed as an analytical tool in molecular biology research. The availability of radioactive nucleoside triphosphates of high specific activity and the development of methods for their incorporation into DNA and RNA have made it possible to identify, isolate, and characterize various nucleic acid sequences of biological interest. Nucleic acid hybridization has great potential in diagnosing disease states associated with unique nucleic acid sequences. These unique nucleic acid sequences may result from genetic or environmental change in DNA by insertions, deletions, point mutations, or by acquiring foreign DNA or RNA by means of infection by bacteria, molds, fungi, and viruses. The application of nucleic acid hybridization as a diagnostic tool in clinical medicine is limited due to the cost and effort associated with the development of sufficiently sensitive and specific methods for detecting potentially low concentrations of disease-related DNA or RNA present in the complex mixture of nucleic acid sequences found in patient samples.

[0006] One method for detecting specific nucleic acid sequences generally involves immobilization of the target nucleic acid on a solid support such as nitrocellulose paper, cellulose paper, diazotized paper, or a nylon membrane. After the target nucleic acid is fixed on the support, the support is contacted with a suitably labeled probe nucleic acid for about two to forty-eight hours. After the above time period, the solid support is washed several times at a controlled temperature to remove unhybridized probe. The support is then dried and the hybridized material is detected by autoradiography or by spectrometric methods. When very low concentrations must be detected, the above method is slow and labor intensive, and nonisotopic labels that are less readily detected than radiolabels are frequently not suitable.

[0007] A method for the enzymatic amplification of specific segments of DNA known as the polymerase chain reaction (PCR) method has been described. This in vitro amplification procedure is based on repeated cycles of denaturation, oligonucleotide primer annealing, and primer extension by thermophilic polymerase, resulting in the exponential increase in copies of the region flanked by the primers. The PCR primers, which anneal to opposite strands of the DNA, are positioned so that the polymerase catalyzed extension product of one primer can serve as a template strand for the other, leading to the accumulation of a discrete fragment whose length is defined by the distance between the 5′ ends of the oligonucleotide primers.

[0008] Other methods for amplifying nucleic acids have also been developed. These methods include single primer amplification, ligase chain reaction (LCR), transcription-mediated amplification methods including 3SR and NASBA, and the Q-beta-replicase method. Regardless of the amplification used, the amplified product must be detected.

[0009] One method for detecting nucleic acids is to employ nucleic acid probes that have sequences complementary to sequences in the target nucleic acid. A nucleic acid probe may be, or may be capable of being, labeled with a reporter group or may be, or may be capable of becoming, bound to a support. Detection of signal depends upon the nature of the label or reporter group. Usually, the probe is comprised of natural nucleotides such as ribonucleotides and deoxyribonucleotides and their derivatives although unnatural nucleotide mimetics such as peptide nucleic acids and oligomeric nucleoside phosphonates are also used. Commonly, binding of the probes to the target is detected by means of a label incorporated into the probe. Alternatively, the probe may be unlabeled and the target nucleic acid labeled. Binding can be detected by separating the bound probe or target from the free probe or target and detecting the label. In one approach, a sandwich is formed comprised of one probe, which may be labeled, the target and a probe that is or can become bound to a surface. Alternatively, binding can be detected by a change in the signal-producing properties of the label upon binding, such as a change in the emission efficiency of a fluorescent or chemiluminescent label. This permits detection to be carried out without a separation step. Finally, binding can be detected by labeling the target, allowing the target to hybridize to a surface-bound probe, washing away the unbound target and detecting the labeled target that remains.

[0010] Direct detection of labeled target hybridized to surface-bound probes is particularly advantageous if the surface contains a mosaic of different probes that are individually localized to discrete, known areas of the surface. Such ordered arrays containing a large number of oligonucleotide probes have been developed as tools for high throughput analyses of genotype and gene expression. Oligonucleotides synthesized on a solid support recognize uniquely complementary nucleic acids by hybridization, and arrays can be designed to define specific target sequences, analyze gene expression patterns or identify specific allelic variations. One difficulty in the design of oligonucleotide arrays is that oligonucleotides targeted to different regions of the same gene can show large differences in hybridization efficiency, presumably due, at least in part, to the interplay between the secondary structures of the oligonucleotides and their targets and the stability of the final probe/target hybridization product. A method for predicting which oligonucleotides will show detectable hybridization would substantially decrease the number of iterations required for optimal array design and would be particularly useful when the total number of oligonucleotide probes on the array is limited. A method to predict oligonucleotide hybridization efficiency would also streamline the empirical approaches currently used to select potential antisense therapeutics, which are designed to modulate gene expression in vivo by hybridizing to specific messenger RNA (mRNA) molecules and inhibiting their translation into proteins.

[0011] While it is well known that the structure of the target nucleic acid affects the affinity of oligonucleotide hybridization, current methods for predicting target structures from the primary sequence fail to predict target regions accessible for oligonucleotide binding. Consequently, selection of oligonucleotides for antisense reagents or oligonucleotide probe arrays has been largely empirical. As most of the target sequence is sequestered by intramolecular base pairing and not accessible for oligonucleotide binding, the process of identifying good oligonucleotides has required large numbers of low efficiency experiments.

[0012] The design and implementation of algorithms that effectively predict the ability of oligonucleotides to rapidly and avidly bind to complementary nucleotide sequences has been an important problem in molecular biology since the invention of facile methods for chemical DNA synthesis. The subsequent inventions of the polymerase chain reaction (PCR), antisense inhibition of gene expression and oligonucleotide array methods for performing massively parallel hybridization experiments have made the need for effective predictive algorithms even more critical.

[0013] Previous attempts to solve the nucleic acid probe design problem include PCR primer design software applications (e.g., OLIGO®), neural networks, PCR primer design applications that search for sequences that possess minimal ability to cross-hybridize with other targets present in a sample (e.g., HYBsimulator™), and approaches that attempt to predict the efficiency of antisense sequence suppression of mRNA translation from a combination of predicted nucleic acid duplex melting temperature and predicted target strand structure. The methods that predict effective oligonucleotide primers for performing PCR from DNA templates work well for that application where relatively stringent conditions are employed. This is because PCR experimental design greatly simplifies the prediction problem: hybridization is performed at high temperature, at relatively low ionic strength and in the presence of a large molar excess of oligonucleotide. Under these conditions, the oligonucleotide and target secondary structures are relatively unimportant.

[0014] Unfortunately, these conditions do not apply to oligonucleotide arrays, which are usually hybridized under relatively non-denaturing conditions, or to antisense suppression of gene expression, which takes place in vivo. Oligonucleotide arrays can contain hundreds of thousands of different sequences and conditions are chosen to allow the oligonucleotide with the lowest melting temperature to hybridize efficiently. These “lowest common denominator” conditions are usually relatively non-denaturing and secondary structure constraints become significant. Accordingly, the above applications require new predictive methods that are capable of estimating the effects of oligonucleotide and target structure on hybridization efficiency. For these reasons, current algorithms for designing PCR primer oligonucleotides fail badly when applied to the problems of oligonucleotide array or antisense oligonucleotide design.

[0015] To date, the most effective approach for identifying oligonucleotides with good hybridization efficiency has been an empirical one. Such an approach involves the synthesis of large numbers of oligonucleotide probes for a given target nucleotide sequence. Arrays are formed that include the above oligonucleotide probes. Hybridization experiments are carried out to determine which of the oligonucleotide probes exhibit good hybridization efficiencies. Examples of such an approach are found in D. Lockhart, et al., Nature Biotech., infra, L. Wodicka, et al., Nature Biotechnology, infra, and N. Milner, et al., Nature Biotech., infra. One major drawback to this approach is the vast number of oligonucleotides that must be synthesized in order to achieve a satisfactory result. Typically, about 2%-5% of the test probes synthesized yield acceptable signal levels.

[0016] The use of neural networks for oligonucleotide design has also been investigated. Neural networks are easily taught with real data; they therefore afford a general approach to many problems. However, their performance is limited by the “senses” that they are given. An analogy works best here: the human brain is an astoundingly capable neural network, but a blind person cannot be taught to reliably distinguish colors by smell. In addition, a large amount of data is required to adequately teach a neural network to perform its job well. A comprehensive database for either oligonucleotide array design or antisense suppression of gene expression has not been made available. For these reasons, the performance reported to date of neural network solutions to the probe design problem is mediocre.

[0017] Finally, approaches that have attempted to use target nucleic acid folding calculations to predict experimental results inferred to depend upon hybridization efficiency (e.g., antisense suppression of mRNA translation) have so far only demonstrated that the predictions of current nucleic acid folding calculations correlate poorly with observed behavior. The probable reason for this is that the structures predicted by such programs for long sequences are poor predictors of chemical reality; the results of experiments that attempt to confirm the predictions of such calculations support this assessment. Recent improvements to this approach which use predicted RNA structure topology as a predictor of relative RNA/RNA association kinetics have been more successful at forecasting the results of antisense experiments. However, these methods are not computationally efficient, and have so far only been shown to work for targets less than 100 bases long. Such methods are therefore not yet capable of predicting the behavior of full-length mRNA targets, which are typically between 1,000 and 2,000 bases in length.

[0018] 2. Description of the Related Art

[0019] U.S. Pat. No. 5,512,438 (Ecker) discloses the inhibition of RNA expression by forming a pseudo-half knot RNA at the target's RNA secondary structure using antisense oligonucleotides.

[0020] Cook, et al., in U.S. Pat. No. 5,670,633 discuss sugar-modified oligonucleotides that detect and modulate gene expression.

[0021] Antisense oligonucleotide inhibition of the RAS gene is disclosed in U.S. Pat. No. 5,582,986 (Monia, et al.).

[0022] U.S. Pat. No. 5,593,834 (Lane, et al.) discusses a method of preparing DNA sequences with known ligand binding characteristics.

[0023] Mitsuhashi, et al., in U.S. Pat. No. 5,556,749 discuss a computerized method for designing optimal DNA probes and an oligonucleotide probe design station.

[0024] U.S. Pat. No. 5,081,584 (Omichinski, et al.) discloses a computer-assisted design of anti-peptides based on the amino acid sequence of a target peptide.

[0025] A PCR primer design application that searches for sequences that possess minimal ability to cross-hybridize with other targets present in a sample is available as HYBsimulator™, version 2.0, AGCT, Inc., 2102 Business Center Drive, Suite 170, Irvine, Calif. 92715 (714) 833-9983.

[0026] A PCR primer design software application is available as OLIGO®, version 5.0, National Biosciences, Inc., 3650 Annapolis Lane North, #140, Plymouth, Minn. 55447 (800) 747-4362.

[0027] D. J. Lockhart, et al., Nature Biotech. 14:1675-1684 (1996) describe a neural network approach to the selection of efficient surface-bound oligonucleotide probes.

[0028] M. Mitsuhashi, et al., Nature, 367:759-761 (1994) disclose a method for designing specific oligonucleotide probes and primers by modeling the potential cross-hybridization of candidate probes to non-target sequences known to be present in samples.

[0029] R. A. Stull, et al., Nuc. Acids Res., 20:3501-3508 (1992) describe a method of predicting the efficacy of antisense oligonucleotides, using predicted target secondary structure and predicted oligonucleotide/target binding free energy as input parameters.

[0030] N. Milner, et al., Nature Biotechnology, 15:537-541 (1997) compare observed patterns of probe hybridization to those expected from the predicted secondary structure of the nucleic acid target.

[0031] L. Wodicka, et al., Nature Biotechnology, 15:1359-1367 (1997) describe simple rules for avoiding inefficient and non-specific probes during design and synthesis of oligonucleotide arrays.

[0032] J. SantaLucia Jr., et al., Biochemistry, 35:3555 (1996) disclose parameters and methods for the calculation of thermodynamic properties of DNA/DNA homoduplexes.

[0033] N. Sugimoto, et al., Biochemistry, 34:11211 (1995) disclose parameters and methods for the calculation of thermodynamic properties of DNA/RNA heteroduplexes.

[0034] J. A. Jaeger, et al., Proc. Natl. Acad. Sci. USA, 86:7706 (1989) disclose methods for estimation of the free energy of the most stable intramolecular structure of a single-stranded polynucleotide, by means of a dynamic programming algorithm.

[0035] S. F. Altschul, et al., Nature Genetics, 6:119-129 (1994) disclose methods for calculating the complexity and information content of amino acid and nucleic acid sequences.

[0036] T. A. Weber and E. Helfand, J. Chem. Phys., 71, 4760 (1979) describe approaches for the modeling of polymer structures by molecular dynamics simulations.

[0037] V. Patzel and G. Sczakiel, Nature Biotech., 16, 64-68 (1998) disclose methods for estimating rate constants for association of antisense RNA molecules with mRNA targets by examination of predicted antisense RNA secondary structures.

[0038] Light-generated oligonucleotide arrays for rapid DNA sequence analysis are described by A. C. Pease, et al., Proc. Nat. Acad. Sci. USA (1994) 91:5022-5026.

[0039] Mitsuhashi discusses basic requirements for designing optimal oligonucleotide probe sequences in J. Clinical Laboratory Analysis (1996) 10:277-284.

[0040] Rychlik, et al., disclose a computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA in Nucleic Acids Research (1989) 17(21):8543-8551.

[0041] A strategy for designing specific antisense oligonucleotide sequences is described by Mitsuhashi in J. Gastroenterol. (1997) 32:282-287.

[0042] Mitsuhashi discusses basic requirements for designing optimal PCR primers in J. Clinical Laboratory Analysis (1996) 10:285-293.

[0043] Hyndman, et al., disclose software to determine optimal oligonucleotide sequences based on hybridization simulation data in BioTechniques (1996) 20(6):1090-1094.

[0044] Eberhardt discloses a shell program for the design of PCR primers using Genetics Computer Group (GCG) software (7.1) on VAX/VMS™ systems in BioTechniques (1992) 13(6):914-917.

[0045] Chen, et al., disclose a computer program for calculating the melting temperature of degenerate oligonucleotides used in PCR or hybridization in BioTechniques (1997) 22(6):1158-1160.

[0046] Partial thermodynamic parameters for predicting the stability and washing behavior of DNA duplexes immobilized on a gel matrix are described by Kunitsyn, et al., in J. Biomolecular Structure & Dynamics, ISSN 0739-1102 (1996) 14(1):239-244.

SUMMARY OF THE INVENTION

[0047] One embodiment of the present invention is a method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined set of unique oligonucleotide sequences is identified. The unique oligonucleotide sequences are chosen to sample the entire length of a nucleotide sequence that is hybridizable with the target nucleotide sequence. At least one parameter that is predictive of the ability of each of the oligonucleotides specified by the set of sequences to hybridize to the target nucleotide sequence is determined and evaluated for each of the above oligonucleotide sequences. A subset of oligonucleotide sequences within the predetermined set of unique oligonucleotide sequences is identified based on the examination of the parameter values. Finally, oligonucleotide sequences in the subset are identified that are clustered along one or more regions of the nucleotide sequence that is hybridizable to the target nucleotide sequence. The oligonucleotide probes corresponding to the identified sequences find use in polynucleotide assays, particularly where the assays involve oligonucleotide arrays. For a discussion of oligonucleotide arrays, see, e.g., U.S. Pat. No. 5,700,637 (E. Southern) and U.S. Pat. No. 5,667,667 (E. Southern), the relevant disclosures of which are incorporated herein by reference.

[0048] Another embodiment of the present invention is a method for predicting the potential of an oligonucleotide to hybridize to a complementary target nucleotide sequence. A set of overlapping oligonucleotide sequences is identified based on a nucleotide sequence that is complementary to the target nucleotide sequence. At least two parameters that are independently predictive of the ability of each of the oligonucleotides specified by the oligonucleotide sequences to hybridize to the target nucleotide sequence are determined and evaluated for each of the oligonucleotide sequences. Independence is assured by requiring that the parameters be poorly correlated with respect to one another. A subset of oligonucleotide sequences within the set of oligonucleotide sequences is identified based on the examination of the parameter values. Finally, oligonucleotide sequences in the subset are identified that are clustered along one or more regions of the nucleotide sequence that is complementary to the target nucleotide sequence.
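The requirement that two parameters be poorly correlated can be illustrated with a short sketch. The Python below is illustrative only and is not taken from the Appendix; the function names `pearson_r` and `poorly_correlated`, the use of the Pearson coefficient as the correlation measure, and the threshold `max_abs_r` are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant parameter has no defined correlation")
    return sxy / sqrt(sxx * syy)


def poorly_correlated(xs, ys, max_abs_r=0.5):
    """True if |r| does not exceed max_abs_r.

    max_abs_r is a hypothetical threshold chosen for illustration; the
    text above does not state a numerical criterion.
    """
    return abs(pearson_r(xs, ys)) <= max_abs_r
```

Here `xs` and `ys` would hold the values of two candidate parameters computed for the same set of oligonucleotide sequences; a pair passing the check would be treated as independent in the sense described above.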

[0049] Another embodiment of the present invention is a method for predicting the potential of an oligonucleotide to hybridize to a complementary target nucleotide sequence. A set of overlapping oligonucleotide sequences is obtained based on a nucleotide sequence of length L, complementary to the target nucleotide sequence. The oligonucleotide sequences of the set of overlapping oligonucleotide sequences are of identical length N and spaced one nucleotide apart. The set comprises L−N+1 oligonucleotide sequences. Parameters are determined for each of the oligonucleotide sequences of the set of overlapping oligonucleotide sequences. One parameter is the predicted melting temperature of the duplex of each of the oligonucleotides specified by the oligonucleotide sequences and the target nucleotide sequence, corrected for salt concentration. The other parameter is the predicted free energy of the most stable intramolecular structure of each of the oligonucleotides specified by the oligonucleotide sequences at the temperature of hybridization of the oligonucleotide with the target nucleotide sequence. A subset of oligonucleotide sequences within the set of oligonucleotide sequences is selected based on an examination of the parameter values by establishing cut-off values for each of the parameters. Oligonucleotide sequences in the subset that are clustered along one or more regions of the complementary nucleotide sequence are ranked based on the sizes of the clusters of oligonucleotide sequences. Finally, a subset of the clustered oligonucleotide sequences is selected that statistically samples the clusters of oligonucleotide sequences. The selected sampled subset is used to specify the synthesis of oligonucleotides for experimental evaluation.
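The tiling, cut-off and cluster-ranking steps of this embodiment can be sketched as follows. This is a minimal Python illustration, not the code of the Appendix. All function names are hypothetical, the melting temperatures and free energies are assumed to be supplied by separate prediction routines that are not shown, and the direction of each cut-off (melting temperature at or above a minimum, intramolecular free energy at or above a minimum, i.e. a less stable self-structure) is an assumption made for the example.

```python
def tile_probes(complement, n):
    """Return (start, sequence) for every window of length n, spaced one
    nucleotide apart; a sequence of length L yields L - n + 1 windows."""
    if n < 1 or n > len(complement):
        raise ValueError("probe length must be between 1 and len(complement)")
    return [(i, complement[i:i + n]) for i in range(len(complement) - n + 1)]


def apply_cutoffs(tm_values, dg_values, tm_min, dg_min):
    """Return the start positions that pass both cut-offs.

    tm_values[i] and dg_values[i] are assumed to be precomputed for the
    probe starting at position i. The cut-off directions are assumptions.
    """
    return [i for i, (tm, dg) in enumerate(zip(tm_values, dg_values))
            if tm >= tm_min and dg >= dg_min]


def find_clusters(starts):
    """Group start positions into runs of consecutive positions."""
    clusters = []
    for s in sorted(starts):
        if clusters and s == clusters[-1][-1] + 1:
            clusters[-1].append(s)   # extends the current run
        else:
            clusters.append([s])     # begins a new run
    return clusters


def rank_clusters(clusters):
    """Order clusters from largest to smallest (stable for ties)."""
    return sorted(clusters, key=len, reverse=True)
```

For a 10-nucleotide complement and N = 4, `tile_probes` returns 10 − 4 + 1 = 7 candidate sequences. Treating a cluster as a run of consecutive start positions is one simple reading of "clustered along a region"; other definitions, for example ones tolerating small gaps, would fit the text equally well.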

[0050] Another aspect of the present invention is a computer based method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined number of unique oligonucleotides within a nucleotide sequence that is hybridizable with the target nucleotide sequence is identified under computer control. The oligonucleotides are chosen to sample the entire length of the nucleotide sequence. A value is determined and evaluated under computer control for each of the oligonucleotides for at least one parameter that is independently predictive of the ability of each of the oligonucleotides to hybridize to the target nucleotide sequence. The parameter values are stored. A subset of oligonucleotides within the predetermined number of unique oligonucleotides is identified by examination of the stored parameter values under computer control. Then, oligonucleotides in the subset that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence are identified under computer control.

[0051] Another aspect of the present invention is a computer system for conducting a method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. The system comprises (a) input means for introducing a target nucleotide sequence into the computer system, (b) means for determining a number of unique oligonucleotide sequences that are within a nucleotide sequence that is hybridizable with the target nucleotide sequence where the oligonucleotide sequences are chosen to sample the entire length of the nucleotide sequence, (c) memory means for storing the oligonucleotide sequences, (d) means for controlling the computer system to carry out for each of the oligonucleotide sequences a determination and evaluation of a value for at least one parameter that is independently predictive of the ability of each of the oligonucleotide sequences to hybridize to the target nucleotide sequence, (e) means for storing the parameter values, (f) means for controlling the computer to carry out an identification, from the stored parameter values, of a subset of oligonucleotide sequences within the number of unique oligonucleotide sequences based on the examination of the parameter, (g) means for storing the subset of oligonucleotides, (h) means for controlling the computer to carry out an identification of oligonucleotide sequences in the subset that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence, (i) means for storing the oligonucleotide sequences in the subset, and (j) means for outputting data relating to the oligonucleotide sequences in the subset.

BRIEF DESCRIPTION OF THE DRAWINGS

[0052] FIG. 1 is a general flow chart depicting the method of the present invention.

[0053] FIG. 2 is a flow chart depicting a preferred embodiment of a method in accordance with the present invention.

[0054] FIG. 3 is a contour plot of normalized hybridization intensity from multiple experiments, as a function of the free energy of the most stable probe intramolecular structure (ΔG_(MFOLD)) and the difference between the predicted RNA/DNA heteroduplex melting temperature (T_(m)) and the temperature of hybridization (T_(hyb)).

[0055] FIG. 4 shows the observed hybridization patterns for oligonucleotides selected using a method in accordance with the present invention and additional oligonucleotides to a portion of the rabbit β-globin gene (radiolabeled antisense RNA target).

[0056] FIG. 5 shows the observed hybridization patterns for oligonucleotides selected using a method in accordance with the present invention and additional oligonucleotides to the HIV PRT gene (fluorescein-labeled sense RNA target).

[0057] FIG. 6 shows the observed hybridization patterns for oligonucleotides selected using a method in accordance with the present invention and additional oligonucleotides to the G3PDH gene (fluorescein-labeled antisense RNA target).

[0058] FIG. 7 shows the observed hybridization patterns for oligonucleotides selected using a method in accordance with the present invention and additional oligonucleotides to the p53 gene (fluorescein-labeled antisense RNA target).

[0059] FIG. 8 shows the observed hybridization patterns for oligonucleotides selected using a method in accordance with the present invention and additional oligonucleotides to the HIV PRTs gene (using GeneChip™ data).

DEFINITIONS

[0060] Before proceeding further with a description of the specific embodiments of the present invention, a number of terms will be defined.

[0061] Nucleic Acids:

[0062] Polynucleotide—a compound or composition that is a polymeric nucleotide or nucleic acid polymer. The polynucleotide may be a natural compound or a synthetic compound. In the context of an assay, the polynucleotide is often referred to as a polynucleotide analyte. The polynucleotide can have from about 20 to 5,000,000 or more nucleotides. The larger polynucleotides are generally found in the natural state. In an isolated state the polynucleotide can have about 30 to 50,000 or more nucleotides, usually about 100 to 20,000 nucleotides, more frequently 500 to 10,000 nucleotides. It is thus obvious that isolation of a polynucleotide from the natural state often results in fragmentation. The polynucleotides include nucleic acids, and fragments thereof, from any source in purified or unpurified form including DNA (dsDNA and ssDNA) and RNA, including tRNA, mRNA, rRNA, mitochondrial DNA and RNA, chloroplast DNA and RNA, DNA/RNA hybrids, or mixtures thereof, genes, chromosomes, plasmids, the genomes of biological material such as microorganisms, e.g., bacteria, yeasts, viruses, viroids, molds, fungi, plants, animals, humans, and the like. The polynucleotide can be only a minor fraction of a complex mixture such as a biological sample. Also included are genes, such as hemoglobin gene for sickle-cell anemia, cystic fibrosis gene, oncogenes, cDNA, and the like.

[0063] The polynucleotide can be obtained from various biological materials by procedures well known in the art. The polynucleotide, where appropriate, may be cleaved to obtain a fragment that contains a target nucleotide sequence, for example, by shearing or by treatment with a restriction endonuclease or other site specific chemical cleavage method.

[0064] For purposes of this invention, the polynucleotide, or a cleaved fragment obtained from the polynucleotide, will usually be at least partially denatured or single stranded or treated to render it denatured or single stranded. Such treatments are well known in the art and include, for instance, heat or alkali treatment, or enzymatic digestion of one strand. For example, dsDNA can be heated at 90-100° C. for a period of about 1 to 10 minutes to produce denatured material.

[0065] Target nucleotide sequence—a sequence of nucleotides to be identified, usually existing within a portion or all of a polynucleotide, usually a polynucleotide analyte. The identity of the target nucleotide sequence generally is known to an extent sufficient to allow preparation of various sequences hybridizable with the target nucleotide sequence and of oligonucleotides, such as probes and primers, and other molecules necessary for conducting methods in accordance with the present invention, an amplification of the target polynucleotide, and so forth.

[0066] The target sequence usually contains from about 30 to 5,000 or more nucleotides, preferably 50 to 1,000 nucleotides. The target nucleotide sequence is generally a fraction of a larger molecule or it may be substantially the entire molecule such as a polynucleotide as described above. The minimum number of nucleotides in the target nucleotide sequence is selected to assure that the presence of a target polynucleotide in a sample is a specific indicator of the presence of polynucleotide in a sample. The maximum number of nucleotides in the target nucleotide sequence is normally governed by several factors: the length of the polynucleotide from which it is derived, the tendency of such polynucleotide to be broken by shearing or other processes during isolation, the efficiency of any procedures required to prepare the sample for analysis (e.g., transcription of a DNA template into RNA) and the efficiency of detection and/or amplification of the target nucleotide sequence, where appropriate.

[0067] Oligonucleotide—a polynucleotide, usually single stranded, usually a synthetic polynucleotide but may be a naturally occurring polynucleotide. The oligonucleotide(s) are usually comprised of a sequence of at least 5 nucleotides, preferably, 10 to 100 nucleotides, more preferably, 20 to 50 nucleotides, and usually 10 to 30 nucleotides, more preferably, 20 to 30 nucleotides, and desirably about 25 nucleotides in length.

[0068] Various techniques can be employed for preparing an oligonucleotide. Such oligonucleotides can be obtained by biological synthesis or by chemical synthesis. For short sequences (up to about 100 nucleotides), chemical synthesis will frequently be more economical as compared to biological synthesis. In addition to economy, chemical synthesis provides a convenient way of incorporating low molecular weight compounds and/or modified bases during specific synthesis steps. Furthermore, chemical synthesis is very flexible in the choice of length and region of the target polynucleotide binding sequence. The oligonucleotide can be synthesized by standard methods such as those used in commercial automated nucleic acid synthesizers. Chemical synthesis of DNA on a suitably modified glass or resin can result in DNA covalently attached to the surface. This may offer advantages in washing and sample handling. For longer sequences, standard replication methods employed in molecular biology can be used, such as the use of M13 for single stranded DNA as described by J. Messing (1983) Methods Enzymol, 101:20-78.

[0069] Other methods of oligonucleotide synthesis include phosphotriester and phosphodiester methods (Narang, et al. (1979) Meth. Enzymol 68:90) and synthesis on a support (Beaucage, et al. (1981) Tetrahedron Letters 22:1859-1862) as well as phosphoramidite techniques (Caruthers, M. H., et al., “Methods in Enzymology,” Vol. 154, pp. 287-314 (1988)) and others described in “Synthesis and Applications of DNA and RNA,” S. A. Narang, editor, Academic Press, New York, 1987, and the references contained therein. The chemical synthesis via a photolithographic method of spatially addressable arrays of oligonucleotides bound to glass surfaces is described by A. C. Pease, et al., Proc. Nat. Acad. Sci. USA (1994) 91:5022-5026.

[0070] Oligonucleotide probe—an oligonucleotide employed to bind to a portion of a polynucleotide such as another oligonucleotide or a target nucleotide sequence. The design and preparation of the oligonucleotide probes are generally dependent upon the sensitivity and specificity required, the sequence of the target polynucleotide and, in certain cases, the biological significance of certain portions of the target polynucleotide sequence.

[0071] Oligonucleotide primer(s)—an oligonucleotide that is usually employed in a chain extension on a polynucleotide template such as in, for example, an amplification of a nucleic acid. The oligonucleotide primer is usually a synthetic nucleotide that is single stranded, containing a sequence at its 3′-end that is capable of hybridizing with a defined sequence of the target polynucleotide. Normally, an oligonucleotide primer has at least 80%, preferably 90%, more preferably 95%, most preferably 100%, complementarity to a defined sequence or primer binding site. The number of nucleotides in the hybridizable sequence of an oligonucleotide primer should be such that stringency conditions used to hybridize the oligonucleotide primer will prevent excessive random non-specific hybridization. Usually, the number of nucleotides in the oligonucleotide primer will be at least as great as the defined sequence of the target polynucleotide, namely, at least ten nucleotides, preferably at least 15 nucleotides, and generally from about 10 to 200, preferably 20 to 50, nucleotides.

[0072] In general, in primer extension, amplification primers hybridize to, and are extended along (chain extended), at least the target nucleotide sequence within the target polynucleotide and, thus, the target sequence acts as a template. The extended primers are chain “extension products.” The target sequence usually lies between two defined sequences but need not. In general, the primers hybridize with the defined sequences or with at least a portion of such target polynucleotide, usually at least a ten-nucleotide segment at the 3′-end thereof and preferably at least 15, frequently a 20 to 50 nucleotide segment thereof.

[0073] Nucleoside triphosphates—nucleosides having a 5′-triphosphate substituent. The nucleosides are pentose sugar derivatives of nitrogenous bases of either purine or pyrimidine derivation, covalently bonded to the 1′-carbon of the pentose sugar, which is usually a deoxyribose or a ribose. The purine bases include adenine (A), guanine (G), inosine (I), and derivatives and analogs thereof. The pyrimidine bases include cytosine (C), thymine (T), uracil (U), and derivatives and analogs thereof. Nucleoside triphosphates include deoxyribonucleoside triphosphates such as the four common deoxyribonucleoside triphosphates dATP, dCTP, dGTP and dTTP and ribonucleoside triphosphates such as the four common triphosphates rATP, rCTP, rGTP and rUTP.

[0074] The term “nucleoside triphosphates” also includes derivatives and analogs thereof, which are exemplified by those derivatives that are recognized and polymerized in a similar manner to the underivatized nucleoside triphosphates.

[0075] Nucleotide—a base-sugar-phosphate combination that is the monomeric unit of nucleic acid polymers, i.e., DNA and RNA. The term “nucleotide” as used herein includes modified nucleotides as defined below.

[0076] DNA—deoxyribonucleic acid.

[0077] RNA—ribonucleic acid.

[0078] Modified nucleotide—a unit in a nucleic acid polymer that contains a modified base, sugar or phosphate group. The modified nucleotide can be produced by a chemical modification of the nucleotide either as part of the nucleic acid polymer or prior to the incorporation of the modified nucleotide into the nucleic acid polymer. For example, the methods mentioned above for the synthesis of an oligonucleotide may be employed. In another approach a modified nucleotide can be produced by incorporating a modified nucleoside triphosphate into the polymer chain during an amplification reaction. Examples of modified nucleotides, by way of illustration and not limitation, include dideoxynucleotides, derivatives or analogs that are biotinylated, amine modified, alkylated, fluorophore-labeled, and the like and also include phosphorothioate, phosphite, ring atom modified derivatives, and so forth.

[0079] Nucleoside—a base-sugar combination, i.e., a nucleotide lacking a phosphate moiety.

[0080] Nucleotide polymerase—a catalyst, usually an enzyme, for forming an extension of a polynucleotide along a DNA or RNA template where the extension is complementary thereto. The nucleotide polymerase is a template dependent polynucleotide polymerase and utilizes nucleoside triphosphates as building blocks for extending the 3′-end of a polynucleotide to provide a sequence complementary with the polynucleotide template. Usually, the catalysts are enzymes, such as DNA polymerases, for example, prokaryotic DNA polymerase (I, II, or III), T4 DNA polymerase, T7 DNA polymerase, Klenow fragment, reverse transcriptase, Vent DNA polymerase, Pfu DNA polymerase, Taq DNA polymerase, and the like, or RNA polymerases, such as T3 and T7 RNA polymerases. Polymerase enzymes may be derived from any source such as cells, bacteria such as E. coli, plants, animals, virus, thermophilic bacteria, and so forth.

[0081] Amplification of nucleic acids or polynucleotides—any method that results in the formation of one or more copies of a nucleic acid or polynucleotide molecule (exponential amplification) or in the formation of one or more copies of only the complement of a nucleic acid or polynucleotide molecule (linear amplification).

[0082] Hybridization (hybridizing) and binding—in the context of nucleotide sequences these terms are used interchangeably herein. The ability of two nucleotide sequences to hybridize with each other is based on the degree of complementarity of the two nucleotide sequences, which in turn is based on the fraction of matched complementary nucleotide pairs. The more nucleotides in a given sequence that are complementary to another sequence, the more stringent the conditions can be for hybridization and the more specific will be the binding of the two sequences. Increased stringency is achieved by elevating the temperature, increasing the ratio of co-solvents, lowering the salt concentration, and the like.

[0083] Hybridization efficiency—the productivity of a hybridization reaction, measured as either the absolute or relative yield of oligonucleotide probe/polynucleotide target duplex formed under a given set of conditions in a given amount of time.

[0084] Homologous or substantially identical polynucleotides—In general, two polynucleotide sequences that are identical or can each hybridize to the same polynucleotide sequence are homologous. The two sequences are homologous or substantially identical where the sequences each have at least 90%, preferably 100%, of the same or analogous base sequence where thymine (T) and uracil (U) are considered the same. Thus, the ribonucleotides A, U, C and G are taken as analogous to the deoxynucleotides dA, dT, dC, and dG, respectively. Homologous sequences can both be DNA or one can be DNA and the other RNA.

[0085] Complementary—Two sequences are complementary when the sequence of one can bind to the sequence of the other in an anti-parallel sense wherein the 3′-end of each sequence binds to the 5′-end of the other sequence and each A, T(U), G, and C of one sequence is then aligned with a T(U), A, C, and G, respectively, of the other sequence. RNA sequences can also include complementary G/U or U/G basepairs.
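By way of illustration and not limitation, the anti-parallel pairing rule just defined can be sketched in Python for DNA sequences. The names `DNA_COMPLEMENT`, `reverse_complement` and `are_complementary` are hypothetical and are not taken from the Appendix; the sketch handles only the four standard DNA bases and does not treat RNA G/U basepairs.

```python
# Watson-Crick partners for the four standard DNA bases.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that pairs with seq in an anti-parallel sense.

    Both the input and the result are written 5' to 3'.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def are_complementary(seq_a, seq_b):
    """True when seq_b is the exact anti-parallel complement of seq_a."""
    return reverse_complement(seq_a) == seq_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ACTGGCAAT"))
    print(are_complementary("ACTGGCAAT", "ATTGCCAGT"))
```

A sequence passed to `reverse_complement` that contains any other character raises a `KeyError`, which is a deliberate simplification of this sketch.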

[0086] Member of a specific binding pair (“sbp member”)—one of two different molecules, having an area on the surface or in a cavity that specifically binds to and is thereby defined as complementary with a particular spatial and polar organization of the other molecule. The members of the specific binding pair are referred to as cognates or as ligand and receptor (antiligand). These may be members of an immunological pair such as antigen-antibody, or may be operator-repressor, nuclease-nucleotide, biotin-avidin, hormones-hormone receptors, nucleic acid duplexes, IgG-protein A, DNA-DNA, DNA-RNA, and the like.

[0087] Ligand—any compound for which a receptor naturally exists or can be prepared.

[0088] Receptor (“antiligand”)—any compound or composition capable of recognizing a particular spatial and polar organization of a molecule, e.g., epitopic or determinant site. Illustrative receptors include naturally occurring receptors, e.g., thyroxine binding globulin, antibodies, enzymes, Fab fragments, lectins, nucleic acids, repressors, protection enzymes, protein A, complement component C1q, DNA binding proteins or ligands and the like.

[0089] Oligonucleotide Properties:

[0090] Potential of an oligonucleotide to hybridize—the combination of duplex formation rate and duplex dissociation rate that determines the amount of duplex nucleic acid hybrid that will form under a given set of experimental conditions in a given amount of time.

[0091] Parameter—a factor that provides information about the hybridization of an oligonucleotide with a target nucleotide sequence. Generally, the factor is one that is predictive of the ability of an oligonucleotide to hybridize with a target nucleotide sequence. Such factors include composition factors, thermodynamic factors, chemosynthetic efficiencies, kinetic factors, and the like.

[0092] Parameter predictive of the ability to hybridize—a parameter calculated from a set of oligonucleotide sequences wherein the parameter positively correlates with observed hybridization efficiencies of those sequences. The parameter is, therefore, predictive of the ability of those sequences to hybridize. “Positive correlation” can be rigorously defined in statistical terms. The correlation coefficient ρ_(x,y) of two experimentally measured discrete quantities x and y (N values in each set) is defined as $\rho_{x,y} = \frac{\mathrm{Covariance}(x,y)}{\sqrt{\mathrm{Variance}(x)\,\mathrm{Variance}(y)}},$

[0093] where the Covariance (x,y) is defined by $\mathrm{Covariance}(x,y) = \frac{1}{N}\sum_{j=1}^{N}(x_{j} - \mu_{x})(y_{j} - \mu_{y}).$

[0094] The quantities μ_(x) and μ_(y) are the averages of the quantities x and y, while the variances are simply the squares of the standard deviations (defined below). The correlation coefficient is a dimensionless (unitless) quantity between −1 and 1. A correlation coefficient of 1 or −1 indicates that x and y have a linear relationship with a positive or negative slope, respectively. A correlation coefficient of zero indicates no relationship; for example, two sets of random numbers will yield a correlation coefficient near zero. Intermediate correlation coefficients indicate intermediate degrees of relatedness between two sets of numbers. The correlation coefficient is a good statistical measure of the degree to which one set of numbers predicts a second set of numbers.
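By way of illustration and not limitation, the correlation coefficient defined above may be computed as in the following Python sketch. The function name `correlation_coefficient` is hypothetical, and the variances are assumed to use the same 1/N normalization as the covariance; any consistent normalization cancels in the ratio.

```python
from math import sqrt


def correlation_coefficient(x, y):
    """Correlation coefficient of two equal-length sets of numbers."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Covariance(x, y) = (1/N) * sum over j of (x_j - mu_x)(y_j - mu_y)
    covariance = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Variances taken with the same 1/N normalization (an assumption).
    variance_x = sum((a - mu_x) ** 2 for a in x) / n
    variance_y = sum((b - mu_y) ** 2 for b in y) / n
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant set")
    return covariance / sqrt(variance_x * variance_y)


if __name__ == "__main__":
    # A perfectly linear pair with positive slope gives a value near 1.
    print(correlation_coefficient([1, 2, 3, 4], [10, 20, 30, 40]))
```

In the present context x would be the values of a candidate parameter and y the observed hybridization efficiencies for the same oligonucleotides.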

[0095] Composition factor—a numerical factor based solely on the composition or sequence of an oligonucleotide without involving additional parameters, such as experimentally measured nearest-neighbor thermodynamic parameters. For instance, the fraction (G+C), given by the formula $f_{GC} = \frac{n_{G} + n_{C}}{n_{G} + n_{C} + n_{A} + n_{T\ \mathrm{or}\ U}},$

[0096] where n_(G), n_(C), n_(A) and n_(T or U) are the numbers of G, C, A and T (or U) bases in an oligonucleotide, is an example of a composition factor. Examples of composition factors, by way of illustration and not limitation, are mole fraction (G+C), percent (G+C), sequence complexity, sequence information content, frequency of occurrence of specific oligonucleotide sequences in a sequence database and so forth.
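By way of illustration and not limitation, the fraction (G+C) may be evaluated as in the following Python sketch. The function name `gc_fraction` is hypothetical; characters other than A, C, G, T and U are ignored, which is an assumption of this sketch and not a rule stated herein.

```python
def gc_fraction(seq):
    """Fraction (G+C) of an oligonucleotide sequence.

    Implements f_GC = (n_G + n_C) / (n_G + n_C + n_A + n_(T or U)).
    """
    seq = seq.upper()
    n_g = seq.count("G")
    n_c = seq.count("C")
    n_a = seq.count("A")
    n_t_or_u = seq.count("T") + seq.count("U")
    total = n_g + n_c + n_a + n_t_or_u
    if total == 0:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    return (n_g + n_c) / total


if __name__ == "__main__":
    # The 24-mer of SEQ ID NO:1 contains 4 G and 6 C bases.
    print(gc_fraction("ACTGGCAATCACAATTGCCAGTAA"))
```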

[0097] Thermodynamic factor—numerical factors that predict the behavior of an oligonucleotide in some process that has reached equilibrium. For instance, the free energy of duplex formation between an oligonucleotide and its complement is a thermodynamic factor. Thermodynamic factors for systems that can be subdivided into constituent parts are often estimated by summing contributions from the constituent parts. Such an approach is used to calculate the thermodynamic properties of oligonucleotides.

[0098] Examples of thermodynamic factors, by way of illustration and not limitation, are predicted duplex melting temperature, predicted enthalpy of duplex formation, predicted entropy of duplex formation, free energy of duplex formation, predicted melting temperature of the most stable intramolecular structure of the oligonucleotide or its complement, predicted enthalpy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted entropy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted free energy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted melting temperature of the most stable hairpin structure of the oligonucleotide or its complement, predicted enthalpy of the most stable hairpin structure of the oligonucleotide or its complement, predicted entropy of the most stable hairpin structure of the oligonucleotide or its complement, predicted free energy of the most stable hairpin structure of the oligonucleotide or its complement, thermodynamic partition function for intramolecular structure of the oligonucleotide or its complement and the like.

[0099] Chemosynthetic efficiency—oligonucleotides and nucleotide sequences may both be made by sequential polymerization of the constituent nucleotides. However, the individual addition steps are not perfect; they instead proceed with some fractional efficiency that is less than unity. This may vary as a function of position in the sequence. Therefore, what is really produced is a family of molecules that consists of the desired molecule plus many truncated sequences. These “failure sequences” affect the observed efficiency of hybridization between an oligonucleotide and its complementary target. Examples of chemosynthetic efficiency factors, by way of illustration and not limitation, are coupling efficiencies, overall efficiencies of the synthesis of a target nucleotide sequence or an oligonucleotide probe, and so forth.
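By way of illustration and not limitation, the effect of imperfect addition steps on the yield of the desired full-length molecule can be sketched in Python. The sketch assumes that the addition steps are independent, so that the full-length fraction is the product of the individual step efficiencies; the function name `full_length_fraction` and the 0.99 efficiency used in the example are hypothetical.

```python
import math


def full_length_fraction(step_efficiencies):
    """Fraction of product that is full length.

    step_efficiencies holds one fractional efficiency (0 to 1) per addition
    step; the values may differ with position in the sequence.
    """
    for efficiency in step_efficiencies:
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("each efficiency must lie between 0 and 1")
    return math.prod(step_efficiencies)


if __name__ == "__main__":
    # Hypothetical example: 24 addition steps, each 99% efficient.
    print(full_length_fraction([0.99] * 24))
```

With these illustrative numbers slightly under 79% of the product is full length; the remainder is the family of truncated failure sequences.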

[0100] Kinetic factor—numerical factors that predict the rate at which an oligonucleotide hybridizes to its complementary sequence or the rate at which the hybridized sequence dissociates from its complement. Examples of kinetic factors are steric factors calculated via molecular modeling or measured experimentally, rate constants calculated via molecular dynamics simulations, associative rate constants, dissociative rate constants, enthalpies of activation, entropies of activation, free energies of activation, and the like.

[0101] Predicted duplex melting temperature—the temperature at which an oligonucleotide mixed with a hybridizable nucleotide sequence is predicted to form a duplex structure (double-helix hybrid) with 50% of the hybridizable sequence. At higher temperatures, the amount of duplex is less than 50%; at lower temperatures, the amount of duplex is greater than 50%. The melting temperature T_(m) (°C.) is calculated from the enthalpy (ΔH), entropy (ΔS) and C, the concentration of the most abundant duplex component (for hybridization arrays, the soluble hybridization target), using the equation $T_{m} = \frac{\Delta H}{\Delta S + R \ln C} - 273.15,$

[0102] where R is the gas constant, 1.987 cal/(mole-°K.). For longer sequences (>100 nucleotides), T_(m) can also be estimated from the mole fraction (G+C), χ_(G+C), using the equation

T_(m) = 81.5 + 41.0 χ_(G+C).

[0103] Melting temperature corrected for salt concentration—polynucleotide duplex melting temperatures are calculated with the assumption that the concentration of sodium ion, Na⁺, is 1 M. Melting temperatures T′_(m) calculated for duplexes formed at different salt concentrations are corrected via the semi-empirical equation

T′_(m)([Na⁺]) = T_(m) + 16.6 log([Na⁺]).
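By way of illustration and not limitation, the three melting temperature expressions above may be evaluated as in the following Python sketch. The function names are hypothetical. The sketch assumes ΔH in cal/mole, ΔS in cal/(mole-°K.), concentrations in moles per liter, and a base-10 logarithm in the salt correction; the base of the logarithm is not stated above and is an assumption here.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mole-K), as given in the text


def duplex_tm_celsius(delta_h, delta_s, concentration):
    """T_m = dH / (dS + R ln C) - 273.15, in degrees Celsius.

    delta_h in cal/mole, delta_s in cal/(mole-K), concentration in mol/L
    (the most abundant duplex component).
    """
    return delta_h / (delta_s + R_CAL * math.log(concentration)) - 273.15


def long_sequence_tm_celsius(gc_mole_fraction):
    """T_m = 81.5 + 41.0 * chi_(G+C), for sequences over 100 nucleotides."""
    return 81.5 + 41.0 * gc_mole_fraction


def salt_corrected_tm_celsius(tm_at_1_molar, sodium_molar):
    """T'_m([Na+]) = T_m + 16.6 log([Na+]); base-10 log is assumed."""
    return tm_at_1_molar + 16.6 * math.log10(sodium_molar)


if __name__ == "__main__":
    # Hypothetical thermodynamic values chosen only to exercise the formulas.
    tm = duplex_tm_celsius(-180000.0, -500.0, 1e-6)
    print(tm)
    print(salt_corrected_tm_celsius(tm, 0.1))
```

At a sodium ion concentration of 1 M the correction term vanishes, consistent with the stated reference condition.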

[0104] Predicted enthalpy, entropy and free energy of duplex formation—the enthalpy (ΔH), entropy (ΔS) and free energy (ΔG) are thermodynamic state functions, related by the equation

ΔG=ΔH−T ΔS,

[0105] where T is the temperature in °K. In practice, the enthalpy and entropy are predicted via a thermodynamic model of duplex formation (the “nearest neighbor” model which is explained in more detail below), and used to calculate the free energy and melting temperature.
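By way of illustration and not limitation, the relation ΔG=ΔH−T ΔS can be evaluated as in the following Python sketch. The function name `free_energy` and the numerical values are hypothetical; ΔH is assumed to be in cal/mole and ΔS in cal/(mole-°K.), so that ΔG is returned in cal/mole.

```python
def free_energy(delta_h, delta_s, temp_kelvin):
    """Free energy dG = dH - T dS at absolute temperature T."""
    if temp_kelvin <= 0:
        raise ValueError("temperature must be given in degrees Kelvin")
    return delta_h - temp_kelvin * delta_s


if __name__ == "__main__":
    # Hypothetical values: dH = -100000 cal/mole, dS = -250 cal/(mole-K).
    print(free_energy(-100000.0, -250.0, 310.15))  # at 37 degrees Celsius
    print(free_energy(-100000.0, -250.0, 400.0))   # dG is zero at T = dH/dS
```

The second call shows that the free energy passes through zero at T = ΔH/ΔS, the temperature at which duplex formation is neither favored nor disfavored under standard conditions.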

[0106] Predicted free energy of the most stable intramolecular structure of an oligonucleotide or its complement—single-stranded DNA and RNA molecules that contain self-complementary sequences can form intramolecular secondary structures. For instance, the oligonucleotide 5′-ACTGGCAATCACAATTGCCAGTAA-3′ (SEQ ID NO:1)

[0107] can base pair with itself, to form the structure (SEQ ID NO:1)

  5′-ACTGGCAATCA
     |||||||||  C
3′-AATGACCGTTAA

[0108] where a vertical line indicates Watson-Crick base pair formation. Many such structures are possible for a given sequence; two are of particular interest. The first is the lowest energy “hairpin” structure (formed by folding a sequence back on itself with a connecting loop at least 3 nucleotides long). The second is the lowest energy structure that can be formed by including more complex topologies, such as “bulge loops” (unpaired stretches between two regions of base-paired duplex) and cloverleaf structures, where 3 base-paired stretches meet at a triple-junction. A good example of a complex secondary structure is the structure of a tRNA molecule, an example of which, namely, yeast tRNA^(Ala) is shown below.

[0109] For either type of structure, a value of the free energy of that structure can be calculated, relative to the unpaired strand, by means of a thermodynamic model similar to that used to calculate the free energy of a base-paired duplex structure. Again, the free energy ΔG is calculated from the enthalpy ΔH and the entropy ΔS at a given absolute temperature T via the equation

ΔG=ΔH−T ΔS.

[0110] However, in this case there is the added difficulty that the lowest energy structure must be found. For a simple hairpin structure, this optimization can be performed via a relatively simple search algorithm. For more complex structures (such as a cloverleaf) a dynamic programming algorithm, such as that implemented in the program MFOLD, must be used.
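By way of illustration and not limitation, a simple exhaustive hairpin search can be sketched in Python as follows. This sketch is a simplified stand-in: it scores each candidate hairpin by the number of uninterrupted base pairs in its stem rather than by a predicted free energy, so it is not the thermodynamic optimization described above and is not the search used in the Appendix. The names `PAIRS` and `longest_hairpin_stem` are hypothetical.

```python
# Watson-Crick pairs for DNA and RNA; G/U pairs are deliberately omitted.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def longest_hairpin_stem(seq, min_loop=3):
    """Find the hairpin with the longest uninterrupted stem.

    Returns (stem_pairs, i, j), where i and j are the 0-based indices of the
    innermost paired bases and the loop is seq[i+1:j]. Returns (0, None, None)
    when no hairpin with a loop of at least min_loop nucleotides exists.
    """
    seq = seq.upper()
    n = len(seq)
    best = (0, None, None)
    for i in range(n):
        # j - i - 1 unpaired nucleotides form the connecting loop.
        for j in range(i + min_loop + 1, n):
            k = 0
            # Extend the stem outward from the innermost pair (i, j).
            while i - k >= 0 and j + k < n and (seq[i - k], seq[j + k]) in PAIRS:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best


if __name__ == "__main__":
    print(longest_hairpin_stem("ACTGGCAATCACAATTGCCAGTAA"))
```

Applied to SEQ ID NO:1, the sketch recovers the nine-base-pair stem of the structure shown above, closed by a four-nucleotide loop. A free energy based search would replace the pair count with a thermodynamic score for each candidate structure.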

[0111] Yeast tRNA^(Ala)—The RNA sequence includes many non-standard ribonucleotides, such as D (5,6-dihydrouridine), m¹G (1-methylguanosine), m²G (N²-dimethylguanosine), ψ (pseudouridine), I (inosine), m¹I (1-methylinosine) and T (ribothymidine). Dots (•) mark (non-standard) G=U base pairs. The structure is taken from A. L. Lehninger, et al., Principles of Biochemistry, 2^(nd) Ed. (Worth Publishers, New York, N.Y., 1993).

[Structure diagram: cloverleaf secondary structure of yeast tRNA^(Ala), 5′ to 3′ (SEQ ID NO:2)]

[0112] Coupling efficiencies—chemosynthetic efficiencies are called coupling efficiencies when the synthetic scheme involves successive attachment of different monomers to a growing oligomer; a good example is oligonucleotide synthesis via phosphoramidite coupling chemistry.

[0113] Algorithmic Operations:

[0114] Evaluating a parameter—determination of the numerical value of a numerical descriptor of a property of an oligonucleotide sequence by means of a formula, algorithm or look-up table.

[0115] Filter—a mathematical rule or formula that divides a set of numbers into two subsets. Generally, one subset is retained for further analysis while the other is discarded. If the division into two subsets is achieved by testing the numbers against a simple inequality, then the filter is referred to as a “cut-off”. In the context of the current invention, an example by way of illustration and not limitation is the statement “The predicted self structure free energy must be greater than or equal to −0.4 kcal/mole,” which can be used as a filter for oligonucleotide sequences; this particular filter is also an example of a cut-off.

[0116] Filter set—A set of rules or formulae that successively winnow a set of numbers by identifying and discarding subsets that do not meet specific criteria. In the context of the current invention, an example by way of illustration and not limitation is the compound statement “the predicted self structure free energy must be greater than or equal to −0.4 kcal/mole and the predicted RNA/DNA heteroduplex melting temperature must lie between 60° C. and 85° C.,” which can be used as a filter set for oligonucleotide sequences.
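By way of illustration and not limitation, the cut-off and the filter set quoted above can be sketched in Python as follows. The function names, the dictionary keys `dg_self` and `tm`, and the candidate values in the example are hypothetical; the melting temperature window is assumed to include its end points.

```python
def passes_self_structure_cutoff(dg_self, cutoff=-0.4):
    """Cut-off: self structure free energy (kcal/mole) must be >= cutoff."""
    return dg_self >= cutoff


def passes_tm_window(tm, low=60.0, high=85.0):
    """Filter: melting temperature (deg C) must lie between low and high."""
    return low <= tm <= high


def apply_filter_set(candidates):
    """Successively winnow candidates with the two rules above.

    Each candidate is a dict with hypothetical keys 'name', 'dg_self', 'tm'.
    """
    survivors = [c for c in candidates if passes_self_structure_cutoff(c["dg_self"])]
    survivors = [c for c in survivors if passes_tm_window(c["tm"])]
    return survivors


if __name__ == "__main__":
    # Made-up values for illustration only.
    candidates = [
        {"name": "oligo_1", "dg_self": -0.1, "tm": 72.0},
        {"name": "oligo_2", "dg_self": -2.3, "tm": 70.0},  # fails the cut-off
        {"name": "oligo_3", "dg_self": 0.0, "tm": 91.5},   # fails the Tm window
    ]
    print([c["name"] for c in apply_filter_set(candidates)])
```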

[0117] Examining a parameter—comparing the numerical value of a parameter to some cutoff-value or filter.

[0118] Statistical sampling of a cluster—extraction of a subset of oligonucleotides from a cluster of oligonucleotides based upon some statistical measure, such as rank by oligonucleotide starting position in the sequence complementary to the target sequence.

[0119] First quartile, median and third quartile—If a set of numbers is ranked by value, then the value that divides the lower ¼ from the upper ¾ of the set is the first quartile, the value that divides the set in half is the median and the value that divides the lower ¾ from the upper ¼ of the set is the third quartile.
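By way of illustration and not limitation, the three quartile values can be obtained with the Python standard library as sketched below. The wrapper name `quartiles` is hypothetical. Several interpolation conventions for quartiles are in common use; this sketch relies on the default ("exclusive") method of `statistics.quantiles`, which is an assumption and is not a convention specified herein.

```python
import statistics


def quartiles(values):
    """Return (first quartile, median, third quartile) of a set of numbers.

    Requires at least two values. The numbers need not be pre-sorted;
    statistics.quantiles ranks them by value internally.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q1, q2, q3


if __name__ == "__main__":
    # Hypothetical oligonucleotide starting positions within a cluster.
    print(quartiles([7, 1, 3, 5, 2, 6, 4]))
```

Applied to the starting positions of the oligonucleotides in a cluster, such values offer one way of carrying out the statistical sampling of a cluster defined above.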

[0120] Poorly correlated—If it is not possible to perform a “good” prediction, as defined via statistics, of one set of numbers from another set of numbers using a simple linear model, then the two sets of numbers are said to be poorly correlated.

[0121] Computer program—a written set of instructions that symbolically instructs an appropriately configured computer to execute an algorithm that will yield desired outputs from some set of inputs. The instructions may be written in one or several standard programming languages, such as C, C++, Visual BASIC, FORTRAN or the like. Alternatively, the instructions may be written by imposing a template onto a general-purpose numerical analysis program, such as a spreadsheet.

[0122] Experimental System Components:

[0123] Small organic molecule—a compound of molecular weight less than 1500, preferably 100 to 1000, more preferably 300 to 600 such as biotin, fluorescein, rhodamine and other dyes, tetracycline and other protein binding molecules, and haptens, etc. The small organic molecule can provide a means for attachment of a nucleotide sequence to a label or to a support.

[0124] Support or surface—a porous or non-porous water insoluble material. The surface can have any one of a number of shapes, such as strip, plate, disk, rod, particle, including bead, and the like. The support can be hydrophilic or capable of being rendered hydrophilic and includes inorganic powders such as glass, silica, magnesium sulfate, and alumina; natural polymeric materials, particularly cellulosic materials and materials derived from cellulose, such as fiber containing papers, e.g., filter paper, chromatographic paper, etc.; synthetic or modified naturally occurring polymers, such as nitrocellulose, cellulose acetate, poly(vinyl chloride), polyacrylamide, cross linked dextran, agarose, polyacrylate, polyethylene, polypropylene, poly(4-methylbutene), polystyrene, polymethacrylate, poly(ethylene terephthalate), nylon, poly(vinyl butyrate), etc.; either used by themselves or in conjunction with other materials; glass available as Bioglass, ceramics, metals, and the like. Natural or synthetic assemblies such as liposomes, phospholipid vesicles, and cells can also be employed.

[0125] Binding of oligonucleotides to a support or surface may be accomplished by well-known techniques, commonly available in the literature. See, for example, A. C. Pease, et al., Proc. Nat. Acad. Sci. USA, 91:5022-5026 (1994).

[0126] Label—a member of a signal producing system. Usually the label is part of a target nucleotide sequence or an oligonucleotide probe, either being conjugated thereto or otherwise bound thereto or associated therewith. The label is capable of being detected directly or indirectly. Labels include (i) reporter molecules that can be detected directly by virtue of generating a signal, (ii) specific binding pair members that may be detected indirectly by subsequent binding to a cognate that contains a reporter molecule, (iii) oligonucleotide primers that can provide a template for amplification or ligation or (iv) a specific polynucleotide sequence or recognition sequence that can act as a ligand such as for a repressor protein, wherein in the latter two instances the oligonucleotide primer or repressor protein will have, or be capable of having, a reporter molecule. In general, any reporter molecule that is detectable can be used.

[0127] The reporter molecule can be isotopic or nonisotopic, usually non-isotopic, and can be a catalyst, such as an enzyme, a polynucleotide coding for a catalyst, promoter, dye, fluorescent molecule, chemiluminescent molecule, coenzyme, enzyme substrate, radioactive group, a small organic molecule, amplifiable polynucleotide sequence, a particle such as latex or carbon particle, metal sol, crystallite, liposome, cell, etc., which may or may not be further labeled with a dye, catalyst or other detectable group, and the like. The reporter molecule can be a fluorescent group such as fluorescein, a chemiluminescent group such as luminol, a terbium chelator such as N-(hydroxyethyl) ethylenediaminetriacetic acid that is capable of detection by delayed fluorescence, and the like.

[0128] The label is a member of a signal producing system and can generate a detectable signal either alone or together with other members of the signal producing system. As mentioned above, a reporter molecule can be bound directly to a nucleotide sequence or can become bound thereto by being bound to an sbp member complementary to an sbp member that is bound to a nucleotide sequence. Examples of particular labels or reporter molecules and their detection can be found in U.S. Pat. No. 5,508,178 issued Apr. 16, 1996, at column 11, line 66, to column 14, line 33, the relevant disclosure of which is incorporated herein by reference. When a reporter molecule is not conjugated to a nucleotide sequence, the reporter molecule may be bound to an sbp member complementary to an sbp member that is bound to or part of a nucleotide sequence.

[0129] Signal Producing System—the signal producing system may have one or more components, at least one component being the label. The signal producing system generates a signal that relates to the presence or amount of a target polynucleotide in a medium. The signal producing system includes all of the reagents required to produce a measurable signal. Other components of the signal producing system may be included in a developer solution and can include substrates, enhancers, activators, chemiluminescent compounds, cofactors, inhibitors, scavengers, metal ions, specific binding substances required for binding of signal generating substances, and the like. Other components of the signal producing system may be coenzymes, substances that react with enzymic products, other enzymes and catalysts, and the like. The signal producing system provides a signal detectable by external means, by use of electromagnetic radiation, desirably by visual examination. Signal-producing systems that may be employed in the present invention are those described more fully in U.S. Pat. No. 5,508,178, the relevant disclosure of which is incorporated herein by reference.

[0130] Ancillary Materials—Various ancillary materials will frequently be employed in the methods and assays utilizing oligonucleotide probes designed in accordance with the present invention. For example, buffers and salts will normally be present in an assay medium, as well as stabilizers for the assay medium and the assay components. Frequently, in addition to these additives, proteins may be included, such as albumins, organic solvents such as formamide, quaternary ammonium salts, polycations such as spermine, surfactants, particularly non-ionic surfactants, binding enhancers, e.g., polyalkylene glycols, or the like.

DETAILED DESCRIPTION OF THE INVENTION

[0131] The invention is directed to methods or algorithms for predicting oligonucleotides specific for a nucleic acid target where the oligonucleotides exhibit a high potential for hybridization. The algorithm uses parameters of the oligonucleotide and the oligonucleotide/target nucleotide sequence duplex, which can be readily predicted from the primary sequences of the target polynucleotide and candidate oligonucleotides. In the methods of the present invention, oligonucleotides are filtered based on one or more of these parameters, then further filtered based on the sizes of clusters of oligonucleotides along the input polynucleotide sequence. The methods or algorithms of the present invention may be carried out using either relatively simple user-written subroutines or publicly available stand-alone software applications (e.g., dynamic programming algorithm for calculating self-structure free energies of oligonucleotides). The parameter calculations may be orchestrated and the filtering algorithms may be implemented using any of a number of commercially available computer programs as a framework such as, e.g., Microsoft® Excel spreadsheet, Microsoft® Access relational database and the like. The basic steps involved in the present methods involve parsing a sequence that is complementary to a target nucleotide sequence into a set of overlapping oligonucleotide sequences, evaluating one or more parameters for each of the oligonucleotide sequences, said parameter or parameters being predictive of probe hybridization to the target nucleotide sequence, filtering the oligonucleotide sequences based on the values for each parameter, filtering the oligonucleotide sequences based on the length of contiguous sequence elements and ranking the contiguous sequence elements based on their length. We have found that oligonucleotides in the longest contiguous sequence elements generally show the highest hybridization efficiencies.
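By way of illustration and not limitation, the last two of the basic steps (grouping the oligonucleotides that survive the parameter filters into contiguous sequence elements and ranking those elements by length) can be sketched in Python as follows. The sketch assumes that each oligonucleotide is identified by its starting position and that two oligonucleotides are contiguous when their starting positions differ by exactly the spacing used in parsing; the function name `contiguous_clusters` and its arguments are hypothetical and are not taken from the Appendix.

```python
def contiguous_clusters(passing_starts, spacing=1):
    """Group starting positions into contiguous runs, longest run first.

    passing_starts: starting positions of the oligonucleotides that passed
    the parameter filters. spacing: distance between the starting
    nucleotides of adjacent oligonucleotides in the parsed set.
    """
    clusters = []
    for start in sorted(set(passing_starts)):
        if clusters and start - clusters[-1][-1] == spacing:
            clusters[-1].append(start)  # extends the current run
        else:
            clusters.append([start])    # begins a new run
    # Rank the contiguous sequence elements by the number of members.
    return sorted(clusters, key=len, reverse=True)


if __name__ == "__main__":
    # Made-up starting positions of oligonucleotides that passed the filters.
    print(contiguous_clusters([0, 1, 2, 5, 9, 10]))
```

In this made-up example the run 0, 1, 2 ranks first and the solitary position 5 ranks last. A minimum cluster size could then be applied as a further cut-off.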

[0132] The present methods are based on our recognition that oligonucleotides showing high hybridization efficiencies tend to form clusters. It is believed that this clustering reflects local regions of the target nucleotide sequence that are unstructured and accessible for oligonucleotide binding. Oligonucleotides that are contiguous along a region of the input nucleic acid sequence are identified. These oligonucleotides are sorted based on the length of the contiguous sequence elements. The sorting approach used in the present invention apparently serves as a surrogate for the calculation of local secondary structure of the target nucleotide sequence. This is supported by our observation that treatments intended to eliminate long-range nucleic acid structure (e.g., random fragmentation) do not eliminate the differences in hybridization yields across oligonucleotide probe arrays. This implies that major determinants of efficient hybridization are local regions of the target sequence. The identification of contiguous sequence elements is a simple and efficient method for recognizing clusters of such determinants and, thus, for identifying oligonucleotide probes that exhibit high hybridization efficiency for a target nucleotide sequence.

[0133] As mentioned above, one embodiment of the present invention is a method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined number of unique oligonucleotides is identified. The length of the oligonucleotides may be the same or different. The oligonucleotides are unique in that no two of the oligonucleotides are identical. The unique oligonucleotides are chosen to sample the entire length of a nucleotide sequence that is hybridizable with the target nucleotide sequence. The actual number of oligonucleotides is generally determined by the length of the nucleotide sequence and the desired result. The number of oligonucleotides should be sufficient to achieve a consensus behavior. In other words, the oligonucleotide sequences should be sufficiently numerous that several possible probes overlap or fall within a given region that is expected to yield acceptable hybridization efficiency. Since the location of these regions is not known beforehand, the best strategy is to space the probe sequences equally along the sequence that is hybridizable to the target sequence. Since regions of acceptable hybridization efficiency are generally on the order of 20 nucleotides in length, a practical strategy is to space the starting nucleotides of the oligonucleotide sequences no more than five nucleotides apart. If computation time needed to calculate the predictive parameters is not an issue, then the best strategy is to space the starting nucleotides one nucleotide apart. An important feature of the present invention is to determine oligonucleotides that are clustered along a region of the nucleotide sequence. The predictions made for individual oligonucleotide sequences are not, taken one at a time, very reliable. However, we have found that the predictions that are experimentally observed tend to form contiguous clusters, while the spurious predictions tend to be solitary. Thus, the number of oligonucleotides should be sufficient to achieve the desired clustering.

[0134] Preferably, a set of overlapping sequences is chosen. To this end, the subsequences are chosen so that there is overlap of at least one nucleotide from one oligonucleotide to the next. More preferably, the overlap is two or more nucleotides. Most preferably, the oligonucleotides are spaced one nucleotide apart and the predetermined number is L−N+1 oligonucleotides where L is the length of the nucleotide sequence and N is the length of the oligonucleotides. In the latter situation, the unique oligonucleotides are of identical length N. Thus, a set of overlapping oligonucleotides is a set of oligonucleotides that are subsequences derived from some master sequence by subdividing that sequence in such a way that each subsequence contains either the start or end of at least one other subsequence in the set.

[0135] An example of the above for purposes of illustration and not limitation is presented by the sequence ATGGACTTAGCATTCG (SEQ ID NO:3), from which the following set of overlapping oligonucleotides can be identified:

ATGGACTTAGCA (SEQ ID NO:4)
 TGGACTTAGCAT (SEQ ID NO:5)
  GGACTTAGCATT (SEQ ID NO:6)
   GACTTAGCATTC (SEQ ID NO:7)
    ACTTAGCATTCG (SEQ ID NO:8)

[0136] In this example the overlapping oligonucleotides are spaced one nucleotide apart. In other words, there is overlap of all but one nucleotide from one oligonucleotide to the next. In the example above, the original nucleotide sequence is 16 nucleotides long (L=16). Each of the overlapping oligonucleotides is 12 nucleotides long (N=12) and there are L−N+1=5 oligonucleotides.
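
By way of illustration and not limitation, the parsing of a sequence into overlapping oligonucleotides may be sketched in Python as follows. The function name tile_oligos and its step argument are hypothetical names chosen for this sketch; they are not taken from the Appendix.

```python
def tile_oligos(sequence: str, n: int, step: int = 1) -> list[tuple[int, str]]:
    """Return (start, oligo) pairs for windows of length n whose starts are `step` apart.

    With step=1 this yields L - N + 1 oligonucleotides for a sequence of length L.
    Start positions are 0-based (an assumption of this sketch).
    """
    if n <= 0 or step <= 0:
        raise ValueError("n and step must be positive")
    last_start = len(sequence) - n
    return [(i, sequence[i:i + n]) for i in range(0, last_start + 1, step)]


# SEQ ID NO:3 from the text, tiled into 12-mers spaced one nucleotide apart.
oligos = tile_oligos("ATGGACTTAGCATTCG", 12)
for start, oligo in oligos:
    print(" " * start + oligo)
```

A larger step (for example, five) gives the sparser sampling mentioned earlier, at the cost of fewer oligonucleotides per region.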

[0137] The length of the oligonucleotides may be the same or different and may vary depending on the length of the nucleotide sequence. The length of the oligonucleotides is determined by a practical compromise between the limits of current chemistries for oligonucleotide synthesis and the need for longer oligonucleotides, which exhibit greater binding affinity for the target sequence and are more likely to occur only once in complicated mixtures of polynucleotide targets. Usually, the length of the oligonucleotides is from about 10 to 50 nucleotides, more usually, from about 25 to 35 nucleotides.

[0138] In the next step of the method at least one parameter that is independently predictive of the ability of each of the oligonucleotides of the set to hybridize to the target nucleotide sequence is determined and evaluated for each of the above oligonucleotides. An example of such a parameter, by way of illustration and not limitation, is a parameter selected from the group consisting of composition factors, thermodynamic factors, chemosynthetic efficiencies, kinetic factors and mathematical combinations of these quantities.

[0139] The determination of a parameter may be carried out by known methods. For example, melting temperature of the oligonucleotide/target duplex may be determined using the nearest neighbor method and parameters appropriate for the nucleic acids involved. For DNA/DNA parameters, see J. SantaLucia Jr., et al., (1996) Biochemistry, 35:3555. For RNA/DNA parameters, see N. Sugimoto, et al., (1995) Biochemistry, 34:11211. Briefly, these methods are based on the observation that the thermodynamics of a nucleic acid duplex can be modeled as the sum of a term arising from the entire duplex and a set of terms arising from overlapping pairs of nucleotides (“nearest neighbor” model). For a discussion of the nearest neighbor model, see J. SantaLucia Jr., et al., (1996) Biochemistry, supra, and N. Sugimoto, et al., (1995) Biochemistry, supra. For example, the enthalpy ΔH of the duplex formed by the sequence ATGGACTTAGCA (SEQ ID NO:4)

[0140] and its perfect complement can be approximated by the equation

ΔH ≈ H_(init) + H_(AT) + H_(TG) + H_(GG) + H_(GA) + H_(AC) + H_(CT) + H_(TT) + H_(TA) + H_(AG) + H_(GC) + H_(CA).

[0141] In the above equation, the term H_(init) is the initiation enthalpy for the entire duplex, while the terms H_(AT), . . . , H_(CA) are the so-called “nearest neighbor” enthalpies. Similar equations can be written for the entropy, for the corresponding quantities for RNA homoduplexes, or for DNA/RNA heteroduplexes. The free energy can then be calculated from the enthalpy, entropy and absolute temperature, as described previously.
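
The nearest neighbor sum above may be sketched in Python as follows, for purposes of illustration only. The function names are hypothetical, and the numerical values used in the example are placeholders chosen to make the arithmetic easy to follow; they are not published nearest neighbor parameters, which should be taken from the references cited above.

```python
def nearest_neighbor_pairs(oligo: str) -> list[str]:
    """Split an oligonucleotide into its overlapping dinucleotides (AT, TG, GG, ...)."""
    return [oligo[i:i + 2] for i in range(len(oligo) - 1)]


def duplex_enthalpy(oligo: str, h_init: float, h_nn: dict[str, float]) -> float:
    """Initiation term plus one nearest neighbor term per overlapping pair."""
    return h_init + sum(h_nn[pair] for pair in nearest_neighbor_pairs(oligo))


def free_energy(delta_h: float, delta_s: float, temperature_k: float) -> float:
    """Delta G = Delta H - T * Delta S, with T in kelvin and consistent energy units."""
    return delta_h - temperature_k * delta_s


pairs = nearest_neighbor_pairs("ATGGACTTAGCA")  # SEQ ID NO:4

# PLACEHOLDER parameters (not real data): every pair is assigned -1.0.
placeholder_h_nn = {pair: -1.0 for pair in pairs}
delta_h = duplex_enthalpy("ATGGACTTAGCA", h_init=0.5, h_nn=placeholder_h_nn)
```

The same summation can be reused for the entropy by passing a table of nearest neighbor entropies instead of enthalpies.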

[0142] Predicted free energy of the most stable intramolecular structure of an oligonucleotide (ΔG_(MFOLD)) may be determined using the nucleic acid folding algorithm MFOLD and parameters appropriate for the oligonucleotide, e.g., DNA or RNA. For MFOLD, see J. A. Jaeger, et al., (1989), supra. For DNA folding parameters, see J. SantaLucia Jr., et al., (1996), supra. Briefly, these methods operate in two steps. First, a map of all possible compatible intramolecular base pairs is made. Second, the global minimum of the free energy of the various possible base pairing configurations is found, using the nearest neighbor model to estimate the enthalpy and entropy, the user input temperature to complete the calculation of free energy, and a dynamic programming algorithm to find the global minimum. The algorithm is computationally intensive; calculation times scale as the third power of the sequence length.

[0143] The following Table 1 summarizes groups of parameters that are independently predictive of the ability of each of the oligonucleotides to hybridize to the target nucleotide sequence together with a reference to methods for their determination. Parameters within a given group are known or expected to be strongly correlated to one another, while parameters in different groups are known or expected to be poorly correlated with one another.

TABLE 1
Group | Parameter | Source or Reference
I | duplex enthalpy, ΔH | SantaLucia et al., 1996; Sugimoto et al., 1995
I | duplex entropy, ΔS | SantaLucia et al., 1996; Sugimoto et al., 1995
I | duplex free energy, ΔG | ΔG = ΔH − TΔS (see text)
I | melting temperature, T_(m) | (see text)
I | mole fraction (or percent) G + C | self-explanatory
I | subsequence duplex enthalpy | SantaLucia et al., 1996; Sugimoto et al., 1995
I | subsequence duplex entropy | SantaLucia et al., 1996; Sugimoto et al., 1995
I | subsequence duplex free energy | ΔG = ΔH − TΔS (see text)
I | subsequence duplex T_(m) | (see text)
I | subsequence duplex mole fraction (or percent) G + C | self-explanatory
II | intramolecular enthalpy, ΔH_(MFOLD) | Jaeger et al., 1989; SantaLucia et al., 1996
II | intramolecular entropy, ΔS_(MFOLD) | Jaeger et al., 1989; SantaLucia et al., 1996
II | intramolecular free energy, ΔG_(MFOLD) | ΔG = ΔH − TΔS (see text)
II | hairpin enthalpy, ΔH_(hairpin) | Jaeger et al., 1989; SantaLucia et al., 1996
II | hairpin entropy, ΔS_(hairpin) | Jaeger et al., 1989; SantaLucia et al., 1996
II | hairpin free energy, ΔG_(hairpin) | ΔG = ΔH − TΔS (see text)
II | intramolecular partition function, Z | $Z = \sum_{k\ \mathrm{structures}} \exp\left( -\Delta G_{intramolecular}^{(k)} / RT \right)$
III | sequence complexity | Altschul et al., 1994
III | sequence information content | Altschul et al., 1994
IV | steric factors | molecular modeling or experiment
IV | molecular dynamic simulation | Weber & Hefland, 1979
IV | enthalpy, entropy & free energy of activation | measured experimentally
IV | association & dissociation rates | Patzel & Sczakiel, 1998
V | oligonucleotide chemosynthetic efficiencies | measured experimentally
VI | target synthetic efficiencies | measured experimentally

[0144] In a next step of the present method, a subset of oligonucleotides within the predetermined number of unique oligonucleotides is identified based on the above evaluation of the parameter. A number of mathematical approaches may be followed to sort the oligonucleotides based on a parameter. In one approach a cut-off value is established. The cut-off value is adjustable and can be optimized relative to one or more training data sets. This is done by first establishing some metric for how well a cut-off value is performing; for example, one might use the normalized signal observed for each oligonucleotide in the training set. Once such a metric is established, the cut-off value can be numerically optimized to maximize the value of that metric, using optimization algorithms well known to the art. Alternatively, the cut-off value can be estimated using graphical methods, by graphing the value of the metric as a function of one or more parameters, and then establishing cut-off values that bracket the region of the graph where the chosen metric exceeds some chosen threshold value. In essence, the cut-off values are chosen so that the rule set, when applied to the training data, maximizes the inclusion of oligonucleotides that exhibit good hybridization efficiency and minimizes the inclusion of oligonucleotides that exhibit poor hybridization efficiency.

[0145] A preferred approach to performing such a graph-based optimization of filter parameters is shown in FIG. 3. In FIG. 3, hybridization data from several different genes have been used to prepare a contour plot of relative hybridization intensity as a function of DNA/RNA heteroduplex melting temperature and free energy of the most stable intramolecular structure of the probe. Contours are shown only for regions for which there are data; the white space outside of the outermost contour indicates that there are no experimental data for that region. The details of how the data were obtained can be found in Example 1 below. A summary of the sequences and number of data points employed is shown in Table 2 below. The measured hybridization intensities for each data set were normalized prior to construction of the contour plot depicted in FIG. 3 by dividing each observed intensity by the maximum intensity observed for that gene. In addition, differences in hybridization salt concentrations and hybridization temperatures were accounted for by using the salt concentration-corrected values of the melting temperatures and by subtracting the hybridization temperature from each predicted melting temperature, respectively. The filter set determined by examination of FIG. 3 is indicated by both the dotted open box in the figure and by the inequalities above the box.

[0146] One way in which such a contour plot may be prepared involves the use of an appropriate software application such as Microsoft® Excel® or the like. For example, the cross-tabulation tool may be used in the Microsoft® Excel® program. Data is accumulated into rectangular bins that are 0.5 kcal ΔG_(MFOLD) wide and 2.5° C. T_(m) wide. In each bin the average values of ΔG_(MFOLD), T_(m)−T_(hyb), and the normalized hybridization intensity are calculated. The data is output to the software application DeltaGraph® (Deltapoint, Inc., Monterey, Calif.) and the contour plot is prepared using the tools and instructions provided.

TABLE 2
Target (GenBank Accession No.) | Target Strand | No. Data Points | T_(hyb) | [Na⁺] Correction
HIV protease-reverse transcriptase (PRT)^(a) (M15654) | sense | 1,022 | 35° C. | −1.4° C.
HIV protease-reverse transcriptase (PRT)^(a) (M15654) | antisense | 1,041 | 30° C. | −1.4° C.
HIV protease-reverse transcriptase (PRT)^(b) (M15654) | sense | 88 | 35° C. | −1.4° C.
Human G3PDH (glyceraldehyde-3-phosphate dehydrogenase)^(b) (X01677) | antisense | 93 | 35° C. | −1.4° C.
Human p53^(b) (X02469) | antisense | 93 | 35° C. | −1.4° C.
Rabbit β-globin^(c) (K03256) | antisense | 106 | 30° C. | 0° C.

[0147] Once the cut-off value is selected, a subset of oligonucleotides having parameter values greater than or equal to the cut-off value is identified. This refers to the inclusion of oligonucleotides in a subset based on whether the value of a predictive parameter satisfies an inequality.

[0148] Examples of identifying a subset of oligonucleotides by establishing cut-off values for predictive parameters are as follows: for melting temperature an inequality might be 60° C. ≤ T_(m); for predicted free energy an inequality, preferably, might be $\Delta G_{MFOLD} \geq -0.4\ \frac{kcal}{mole}.$

[0149] In a variation of the above, both a maximum and a minimum cut-off value may be selected. A subset of oligonucleotides is identified whose values fall within the maximum and minimum values, i.e., values greater than or equal to the minimum cut-off value and less than or equal to the maximum cut-off value. An example of this approach for melting temperature might be the inequality 60° C. ≤ T_(m) ≤ 85° C.

[0150] With regard to cut-off values for T_(m), the lower limit is most important, and is preferably T_(m)=T_(hyb), more preferably, T_(m)=T_(hyb)+15° C. The upper cut-off is important when the sequence region under consideration is unusually rich in G and C, and is preferably T_(m)=T_(hyb)+40° C. With regard to ΔG_(MFOLD), the cut-off value is usually greater than or equal to −1.0 kcal/mole. As mentioned above, the cut-off values preferably are determined from real data through experimental observations.
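
A cut-off filter of this kind may be sketched in Python as follows, by way of example and not limitation. The function name passes_filters and its argument names are hypothetical. The default limits are the values mentioned in the preceding paragraph (T_(hyb)+15° C., T_(hyb)+40° C. and −1.0 kcal/mole); the hybridization temperature of 35° C. and the parameter values used in the calls are assumed for illustration only.

```python
def passes_filters(
    tm: float,
    dg_mfold: float,
    t_hyb: float,
    tm_margin_min: float = 15.0,   # lower limit: Tm >= Thyb + 15 deg C
    tm_margin_max: float = 40.0,   # upper limit: Tm <= Thyb + 40 deg C
    dg_min: float = -1.0,          # kcal/mole
) -> bool:
    """Return True when both the Tm window and the self-structure cut-off are satisfied."""
    tm_ok = t_hyb + tm_margin_min <= tm <= t_hyb + tm_margin_max
    dg_ok = dg_mfold >= dg_min
    return tm_ok and dg_ok


# Illustrative calls with an assumed hybridization temperature of 35 deg C.
keep_a = passes_filters(tm=73.10, dg_mfold=-0.80, t_hyb=35.0)
keep_b = passes_filters(tm=71.77, dg_mfold=-1.20, t_hyb=35.0)
keep_c = passes_filters(tm=79.03, dg_mfold=-0.50, t_hyb=35.0)
```

In practice the limits would be tuned against training data, as described above, rather than fixed at these defaults.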

[0151] In another approach the parameter values may be converted into dimensionless numbers. The parameter value is converted into a dimensionless number by determining a dimensionless score for each parameter resulting in a distribution of scores having a mean value of zero and a standard deviation of one. The dimensionless score is a number that is used to rank some object (such as an oligonucleotide) to which that score relates. A score that has no units (i.e., a pure number) is called a dimensionless score.

[0152] In one approach the following equation is used for converting the values of said parameters into dimensionless numbers:
$s_{i,x} = \frac{x_i - \langle x \rangle}{\sigma_{\{x\}}},$

[0153] where s_(i,x) is the dimensionless score derived from parameter x calculated for oligonucleotide i, x_(i) is the value of parameter x calculated for oligonucleotide i, <x> is the average of parameter x calculated for all of the oligonucleotides under consideration for a given nucleotide sequence target, and σ_({x}) is the standard deviation of parameter x calculated for all of the oligonucleotides under consideration for a given nucleotide sequence target, and is given by the equation
$\sigma_{\{x\}} = \sqrt{\frac{\sum_{j=1}^{M} \left( x_j - \langle x \rangle \right)^2}{M - 1}},$

[0154] where M is the number of oligonucleotides. The resulting distribution of scores, {s}, has a mean value of zero and a standard deviation of one. These properties can be important for a combination of the scores discussed below.
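
The conversion to dimensionless scores may be sketched in Python as follows, for purposes of illustration. The function name dimensionless_scores is hypothetical. The sketch relies on statistics.stdev, which uses the M−1 denominator, in agreement with the equation for σ_({x}) above; the three input values are arbitrary illustrative numbers.

```python
import statistics


def dimensionless_scores(values: list[float]) -> list[float]:
    """Convert raw parameter values into scores with mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (M - 1 denominator)
    if sigma == 0:
        raise ValueError("all values are identical; scores are undefined")
    return [(v - mean) / sigma for v in values]


# Arbitrary illustrative melting temperatures for three oligonucleotides.
scores = dimensionless_scores([60.0, 70.0, 80.0])
```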

[0155] The use of a dimensionless number approach may further include calculating a combination score S_(i) by evaluating a weighted average of the individual values of the dimensionless scores s_(i,x) by the equation:
$S_i = \sum_{\{x\}} q_x s_{i,x},$

[0156] where q_(x) is the weight assigned to the score derived from parameter x, the individual values of q_(x) are always greater than zero, and the sum of the weights q_(x) is unity.
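
The weighted combination may be sketched in Python as follows, by way of illustration and not limitation. The function name combination_scores, the dictionary layout and the parameter labels "tm" and "dg" are hypothetical choices for this sketch, and the scores and weights shown are arbitrary.

```python
import math


def combination_scores(
    scores_by_param: dict[str, list[float]],
    weights: dict[str, float],
) -> list[float]:
    """Return S_i = sum over parameters x of q_x * s_(i,x) for every oligonucleotide i."""
    if set(scores_by_param) != set(weights):
        raise ValueError("each parameter needs exactly one weight")
    if any(q <= 0 for q in weights.values()):
        raise ValueError("weights must be greater than zero")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to unity")
    n_oligos = len(next(iter(scores_by_param.values())))
    if any(len(s) != n_oligos for s in scores_by_param.values()):
        raise ValueError("every parameter needs one score per oligonucleotide")
    return [
        sum(weights[x] * scores_by_param[x][i] for x in weights)
        for i in range(n_oligos)
    ]


combined = combination_scores(
    {"tm": [1.0, -1.0], "dg": [0.0, 2.0]},
    {"tm": 0.75, "dg": 0.25},
)
```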

[0157] In another variation of the above approach, the method of calculation of the composite parameter is optimized based on the correlation of the individual composite scores to real data, as explained more fully below.

[0158] In one approach the calculation of the composite score further involves determining a moving window-averaged combination score <S_(i)> for the ith probe by the equation:
$\langle S_i \rangle = \frac{1}{w} \sum_{j = i - \frac{w-1}{2}}^{i + \frac{w-1}{2}} S_j,$

[0159] w = an odd integer,

[0160] where w is the length of the window for averaging (i.e., w nucleotides long), and then applying a cutoff filter to the value of <S_(i)>. This procedure results in smoothing (smoothing procedure) by turning each score into a consensus metric for a set of w adjacent oligonucleotide probes. The score, referred to as the “smoothed score,” is essentially continuous rather than a few discrete values. The value of the smoothed score is strongly influenced by clustering of scores with high or low values; window averaging therefore provides a measurement of cluster size.
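
The moving window average may be sketched in Python as follows, for illustration only. The function name window_average is hypothetical. The equation above does not say how probes near either end of the sequence are handled; this sketch assumes that positions without a full window are left undefined (None), which is one of several reasonable conventions.

```python
from typing import Optional


def window_average(scores: list[float], w: int) -> list[Optional[float]]:
    """Average each score with its (w - 1) / 2 neighbors on either side."""
    if w < 1 or w % 2 == 0:
        raise ValueError("w must be a positive odd integer")
    half = (w - 1) // 2
    smoothed: list[Optional[float]] = [None] * len(scores)
    # Only positions with a complete window are averaged (assumption of this sketch).
    for i in range(half, len(scores) - half):
        smoothed[i] = sum(scores[i - half:i + half + 1]) / w
    return smoothed


smoothed = window_average([1.0, 2.0, 3.0, 4.0, 5.0], w=3)
```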

[0161] An advantage of the dimensionless score approach to the probe prediction algorithm is that it is easy to objectively optimize. In one approach to training the algorithm, optimization of the weights q_(x) above may be performed by varying the values of the weights so that the correlation coefficient ρ_({<Si>},{Vi}) between the set of window-averaged combination scores {<S_(i)>} and a set of calibration experimental measurements {V_(i)} is maximized. The correlation coefficient ρ_({<Si>},{Vi}) is calculated from the equation
$\rho_{\{\langle S_i \rangle\},\{V_i\}} = \left( \frac{1}{M} \right) \frac{\mathrm{Covariance}(\langle S \rangle, V)}{\sigma_{\{\langle S_i \rangle\}}\, \sigma_{\{V_i\}}},$

[0162] where M is the number of window-averaged, combination dimensionless scores and the number of corresponding measurements, the covariance is as defined earlier (see earlier equations) and σ_({<Si>}) and σ_({Vi}) are the standard deviations of {<S_(i)>} and {V_(i)}, as defined previously. An example of this approach is shown in Example 2, below.
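
A weight optimization of this kind may be sketched in Python as follows, by way of example and not limitation. The function names pearson and best_weight are hypothetical. The sketch assumes the usual Pearson normalization, in which the covariance and both standard deviations share the same denominator so that it cancels; it considers only two parameters with weights q and 1−q, uses a coarse grid search in place of a numerical optimizer, and omits the window averaging step for brevity. The scores and measurements are arbitrary illustrative numbers.

```python
import math


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_weight(
    s_a: list[float], s_b: list[float], measured: list[float], steps: int = 20
) -> tuple[float, float]:
    """Grid-search the weight q (0 < q < 1) on s_a, with 1 - q on s_b."""
    best_q, best_r = 0.0, -2.0
    for k in range(1, steps):
        q = k / steps
        combined = [q * x + (1 - q) * y for x, y in zip(s_a, s_b)]
        r = pearson(combined, measured)
        if r > best_r:
            best_q, best_r = q, r
    return best_q, best_r


q_opt, r_opt = best_weight([1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0], [1.0, 2.0, 3.0, 4.0])
```

In this toy case the measurements track the first score exactly, so the search pushes q toward the top of the grid.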

[0163] In another approach the parameter is derived from one or more factors by mathematical transformation of the factors. This involves the calculation of a new predictive parameter from one or more existing predictive parameters, by means of an equation. For instance, the equilibrium constant K_(open) for formation of an oligonucleotide with no intramolecular structure from its structured form can be calculated from the intramolecular structure free energy ΔG_(MFOLD), using the equation:
$K_{open} = \exp\left( \frac{\Delta G_{MFOLD}}{RT} \right).$
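
This transformation may be sketched in Python as follows, for purposes of illustration. The function name k_open is hypothetical; the sketch assumes ΔG_(MFOLD) is expressed in kcal/mole and the temperature in kelvin, and the example temperature of 310.15 K (37° C.) is an assumption, not a value taken from the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def k_open(dg_mfold_kcal: float, temperature_k: float) -> float:
    """K_open = exp(dG_MFOLD / RT); a more negative dG_MFOLD gives a smaller K_open."""
    return math.exp(dg_mfold_kcal / (R_KCAL_PER_MOL_K * temperature_k))


# A self-structure free energy of -1.2 kcal/mole at an assumed 37 deg C.
k = k_open(-1.2, 310.15)
```

A value of K_(open) below one indicates that the structured form is favored at the chosen temperature.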

[0164] In a next step of the method oligonucleotides in the subset are then identified that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence. For example, consider a set of overlapping oligonucleotides identified by dividing a nucleotide sequence into subsequences. A subset of the oligonucleotides is obtained as described above. In general, this subset is obtained by applying a rule that rejects some members of the set. For the remaining members of the set, namely, the subset, there will be some average number of nucleotides in the nucleotide sequence between the first nucleotides of adjacent remaining subsequences. If, for some sub-region of the nucleotide sequence, the average number of nucleotides in the nucleotide sequence between the first nucleotides of adjacent remaining subsequences is less than the average for the entire nucleotide sequence, then the oligonucleotides are clustered. The smaller the average number of nucleotides between the first nucleotides of adjacent oligonucleotides, the stronger the clustering. The strongest clustering occurs when there are no intervening nucleotides between adjacent starting nucleotides. In this case, the oligonucleotides are said to be contiguous and may be referred to as contiguous sequence elements or “contigs.”

[0165] Accordingly, in this step oligonucleotides are sorted based on length of contiguous sequence elements. Oligonucleotides in the subset determined above are identified that are contiguous along a region of the input nucleic acid sequence. The length of each contig is equal to the number of oligonucleotides in that contig; namely, oligonucleotides from the above step whose complements begin at positions m+1, m+2, . . . , m+k in the target sequence form a contig of length k. Contigs can be identified and contig length can be calculated using, for example, a Visual Basic® module that can be incorporated into a Microsoft® Excel workbook.

[0166] Cluster size can be defined in several ways:

[0167] For contiguous clusters, the size is simply the number of adjacent oligonucleotides in the cluster. Again, this may also be referred to as contiguous sequence elements. The number may also be referred to as “contig length”. For example, consider the nucleotide sequence discussed above, namely, ATGGACTTAGCATTCG (SEQ ID NO:3) and the identified set of overlapping oligonucleotides

ATGGACTTAGCA (SEQ ID NO:4)
 TGGACTTAGCAT (SEQ ID NO:5)
  GGACTTAGCATT (SEQ ID NO:6)
   GACTTAGCATTC (SEQ ID NO:7)
    ACTTAGCATTCG (SEQ ID NO:8)

[0168] Suppose that, after calculation and evaluation of the predictive parameters, four oligonucleotides remain:

[0169] A “contig” encompassing three of the oligonucleotides of the subset is present together with a single oligonucleotide. The contig length is 3 oligonucleotides.

[0170] Alternatively, cluster size at some position in the sequence hybridizable or complementary to the target sequence may be defined as the number of oligonucleotides whose center nucleotides fall inside a region of length M centered about the position in question, divided by M. This definition of clustering allows small gaps in clusters. In the example used above for contiguous clusters, if M were 10, then the cluster size would step through the values 0/10, . . . , 0/10, 1/10, 2/10, 3/10, 3/10, 4/10, 4/10, 4/10, 4/10, 4/10, 3/10, 2/10, 1/10, 1/10, 0/10 as the center of the window of length 10 passed through the cluster. In each fraction, the numerator is the number of oligonucleotide sequences that have satisfied the filter set and whose central nucleotides are within a window 10 nucleotides long, centered about the nucleotide under consideration. The denominator (10) is simply the window length.

[0171] Another alternative is to define the size of a cluster at some position in the sequence hybridizable or complementary to the target sequence as the number of oligonucleotide sequences overlapping that position. This definition is equivalent to the last definition with M set equal to the oligonucleotide probe length and omission of the division by M.
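
The two position-based definitions of cluster size given above may be sketched in Python as follows, by way of illustration and not limitation. The function names are hypothetical. The text does not specify how the ends of an even-length window are treated; this sketch assumes a half-open window running from position − M/2 up to, but not including, position + M/2. The center and start positions used in the calls are arbitrary and are not intended to reproduce the worked values in the text.

```python
def windowed_cluster_size(centers: list[int], position: int, m: int) -> float:
    """Fraction of a length-m window around `position` occupied by oligonucleotide centers."""
    if m <= 0:
        raise ValueError("m must be positive")
    low = position - m / 2
    high = position + m / 2  # half-open window [low, high): an assumption of this sketch
    count = sum(1 for c in centers if low <= c < high)
    return count / m


def overlap_cluster_size(starts: list[int], n: int, position: int) -> int:
    """Number of length-n oligonucleotides that cover `position`."""
    return sum(1 for s in starts if s <= position < s + n)


density = windowed_cluster_size(centers=[5, 6, 7, 9], position=7, m=10)
depth = overlap_cluster_size(starts=[0, 1, 2, 4], n=12, position=4)
```

The windowed form tolerates small gaps between filtered oligonucleotides, whereas the contig length definition does not.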

[0172] Finally, cluster size can be approximated at each position in a nucleotide sequence by dividing the sequence into oligonucleotides, evaluating a numerical score for each oligonucleotide, and then averaging the scores in the neighborhood of each position by means of a moving window average as described above. Window averaging has the effect of reinforcing clusters of high or low values around a particular position, while canceling varying values about that position. The window average, therefore, provides a score that is sensitive to both the hybridization potential of a given oligonucleotide and the hybridization potentials of its neighbors.

[0173] In a next step of the present method, the oligonucleotides in the subset are ranked. Generally, this ranking is based on the lengths of the clusters or contigs, sizes of the clusters or values of a window averaged score. Oligonucleotides found in the longest contigs or largest clusters, or possessing the highest window averaged scores usually show the highest hybridization efficiencies. Often, the highest signal intensity within the cluster corresponds to the median oligonucleotide of the cluster. However, the peak signal intensity within the contig can be determined experimentally, by sampling the cluster at its first quartile, midpoint and third quartile, measuring the hybridization efficiencies of the sampled oligonucleotides, interpolating or extrapolating the results, predicting the position of the optimal probe, and then iterating the probe design process.

[0174] FIG. 1 shows a diagram of an example of the above-described method by way of illustration and not limitation. Referring to FIG. 1, a target sequence of length L from, e.g., a database, is used to generate a sequence that is hybridizable to the target sequence from which candidate oligonucleotide probe sequences are generated. One or more parameters are calculated for each of the oligonucleotide probe sequences. The candidate oligonucleotide probe sequences are filtered based on the values of the parameters. Clustering of the filtered candidate probe sequences is evaluated and the clusters are ranked by size. Then, the oligonucleotide probes are statistically sampled and synthesized. Further evaluation may be made by evaluating the hybridization of the selected oligonucleotide probes in real hybridization experiments. The above process may be reiterated to further define the selection. In this way only a small fraction of the potential oligonucleotide probe candidates are synthesized and tested. This is in sharp contrast to the known method of synthesizing and testing all or a major portion of potential oligonucleotide probes for a given target sequence.

[0175] The methods of the present invention are preferably carried out at least in part with the aid of a computer. For example, an IBM® compatible personal computer (PC) may be utilized. The computer is driven by software specific to the methods described herein.

[0176] The preferred computer hardware capable of assisting in the operation of the methods in accordance with the present invention involves a system with at least the following specifications: Pentium® processor or better with a clock speed of at least 100 MHz, at least 32 megabytes of random access memory (RAM) and at least 80 megabytes of virtual memory, running under either the Windows 95 or Windows NT 4.0 operating system (or successor thereof).

[0177] As mentioned above, software that may be used to carry out the methods may be either Microsoft Excel or Microsoft Access, suitably extended via user-written functions and templates, and linked when necessary to stand-alone programs that calculate specific parameters (e.g., MFOLD for intramolecular thermodynamic parameters). Examples of software programs used in assisting in conducting the present methods may be written, preferably, in Visual BASIC, FORTRAN and C++, as exemplified below in the Examples. It should be understood that the above computer information and the software used herein are by way of example and not limitation. The present methods may be adapted to other computers and software. Other languages that may be used include, for example, PASCAL, PERL or assembly language.

[0178] FIG. 2 depicts a more specific approach to a method in accordance with the present invention. Referring to FIG. 2, a sequence of length L is obtained from a database such as GenBank, UniGene or a proprietary sequence database. Probe length N is determined by the user based on the requirements for sensitivity and specificity and the limitations of the oligonucleotide synthetic scheme employed. The probe length and sequence length are used to generate L−N+1 candidate oligonucleotide probes, i.e., one from every possible starting position. An initial selection is made based on local sequence predicted thermodynamic properties. To this end, melting temperature T_(m) and the self-structure free energy ΔG_(MFOLD) are calculated for each of the potential oligonucleotide probe:target nucleotide sequence complexes. Next, M probes that satisfy T_(m) and ΔG_(MFOLD) filters are selected. A further selection can be made based on clustering of “good” parameters. Good parameters are parameters that satisfy all of the filters in the filter set. Clustering is defined by any of the methods described previously; in FIG. 2, the “contig length” definition of clustering is used.

[0179] For each of the M oligonucleotide sequences that satisfied all filters the question is asked whether the oligonucleotide sequence immediately following the sequence under consideration is also one of the sequences that satisfied all of the filters. If the answer to this question is NO, then one stores the current value of the contig length counter, resets the counter to zero and proceeds to the next oligonucleotide sequence that satisfied all filters. If the answer to the question is YES, then 1 is added to the contig length counter and, if the counter now equals 1 (i.e., this is the first oligonucleotide probe sequence in the contig), the starting position of the oligonucleotide is stored. One then moves to the next oligonucleotide that satisfied all filters, which, in this case, is the same as the next oligonucleotide before the application of the filter set. The process is repeated until all M filtered oligonucleotide sequences have been examined. In this way, a single pass through the set of M filtered oligonucleotide sequences generates the lengths and starting positions of all contigs.
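
A single-pass contig search may be sketched in Python as follows, for purposes of illustration and not limitation. The function name find_contigs is hypothetical, and the sketch is not the Visual Basic module referred to in the text. It reports contig length as the number of oligonucleotides in each run of consecutive starting positions, following the definition given earlier (positions m+1, . . . , m+k form a contig of length k), so a solitary oligonucleotide is reported with length 1. The starting positions in the example are arbitrary.

```python
def find_contigs(passing_starts: list[int]) -> list[tuple[int, int]]:
    """Return (start, length) for each run of consecutive start positions.

    `passing_starts` holds the start positions of oligonucleotides that
    satisfied every filter; one pass over the sorted positions is enough.
    """
    contigs: list[tuple[int, int]] = []
    run_start = 0
    run_len = 0
    prev = None
    for pos in sorted(set(passing_starts)):
        if prev is not None and pos == prev + 1:
            run_len += 1                      # next oligonucleotide also passed
        else:
            if run_len:
                contigs.append((run_start, run_len))
            run_start, run_len = pos, 1       # begin a new contig
        prev = pos
    if run_len:
        contigs.append((run_start, run_len))  # close the final contig
    return contigs


contigs = find_contigs([0, 1, 2, 4])
ranked = sorted(contigs, key=lambda c: c[1], reverse=True)  # longest contig first
```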

[0180] Next, contigs are ranked based on the lengths of their contiguous sequence elements. Longer contig lengths generally correlate with higher hybridization efficiencies. All oligonucleotides of the higher-ranking contigs may be considered, or candidate oligonucleotide probes may be picked. For example, candidate oligonucleotide probes can be picked one quarter, one half and three quarters of the way through each contig. The latter approach provides local curvature determination after experimental determination of hybridization efficiencies, which allows either interpolation or extrapolation of the positions of the next probes to be synthesized in order to close in on the optimal probe in the region. If the contig brackets the actual peak of hybridization efficiency, the process will converge in 2-3 iterations. If the contig lies to one side of the actual peak, the process will converge in 3-4 iterations.
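
The quartile sampling of a contig may be sketched in Python as follows, by way of example only. The function name sample_contig is hypothetical, and the rounding of fractional positions to the nearest oligonucleotide is an assumption of this sketch; short contigs may therefore yield fewer than three distinct probes.

```python
def sample_contig(start: int, length: int) -> list[int]:
    """Pick start positions about 1/4, 1/2 and 3/4 of the way through a contig."""
    if length < 1:
        raise ValueError("length must be at least 1")
    offsets = [(length - 1) * k / 4 for k in (1, 2, 3)]
    # A set removes duplicates that arise when the contig is very short.
    return sorted({start + round(offset) for offset in offsets})


picks = sample_contig(start=10, length=9)
```

After the sampled probes are measured, the three results can be used to interpolate or extrapolate toward the peak, as described above.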

[0181] The above illustrative approach is further described with reference to the following DNA nucleotide sequence, which is the complement of the target RNA nucleotide sequence: GTCCAAAAAGGGTCAGTCTACCTCCCGCCATAAAAAACTCATGTTCAAGA (SEQ ID NO:9).

[0182] In the first step of the method, the nucleotide sequence is divided into overlapping oligonucleotides that are 25 nucleotides in length. This length is chosen because it is an effective compromise between the need for sensitivity (enhanced by longer oligonucleotides) and the chemosynthetic efficiency of schemes for synthesis of surface-bound arrays of oligonucleotide probes.
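The division into overlapping oligonucleotides may be sketched as follows, using SEQ ID NO:9 as input. The function name overlapping_oligos and its parameters are hypothetical; the step of one nucleotide between starting positions is assumed here for illustration.

```python
from typing import List


def overlapping_oligos(seq: str, length: int = 25, step: int = 1) -> List[str]:
    """Return every window of `length` nucleotides, advancing `step` each time."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


# SEQ ID NO:9, the complement of the target RNA sequence.
TARGET_COMPLEMENT = "GTCCAAAAAGGGTCAGTCTACCTCCCGCCATAAAAAACTCATGTTCAAGA"

oligos = overlapping_oligos(TARGET_COMPLEMENT)
print(len(oligos))  # 26
print(oligos[0])    # GTCCAAAAAGGGTCAGTCTACCTCC
print(oligos[-1])   # CGCCATAAAAAACTCATGTTCAAGA
```

A sequence of N nucleotides yields N - 25 + 1 overlapping 25-mers, so the 50-nucleotide example gives 26 candidates.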

[0183] Next, the estimated duplex melting temperatures (T_(m)) and self-structure free energies (ΔG_(MFOLD)) are calculated for each oligonucleotide in the set of overlapping oligonucleotides. The values are obtained from a user-written function that calculates DNA/RNA heteroduplex thermodynamic parameters (see N. Sugimoto, et al., Biochemistry, 34:11211 (1995)) and a modified version of the program MFOLD that estimates the free energy of the most stable intramolecular structure of a single stranded DNA molecule (see J. A. Jaeger, et al., (1989), supra), respectively. The steps are illustrated below.

Target complement sequence: GTCCAAAAAGGGTCAGTCTACCTCCCGCCATAAAAAACTCATGTTCAAGA

Oligonucleotide              T_(m) (°C)   ΔG_(MFOLD) (kcal/mole)   SEQ ID NO
GTCCAAAAAGGGTCAGTCTACCTCC    71.77        −1.20                    10
TCCAAAAAGGGTCAGTCTACCTCCC    71.99        −1.20                    11
CCAAAAAGGGTCAGTCTACCTCCCG    70.78        −1.20                    12
CAAAAAGGGTCAGTCTACCTCCCGC    71.23        −1.20                    13
AAAAAGGGTCAGTCTACCTCCCGCC    73.07        −1.20                    14
AAAAGGGTCAGTCTACCTCCCGCCA    75.68        −1.20                    15
AAAGGGTCAGTCTACCTCCCGCCAT    77.53        −1.20                    16
AAGGGTCAGTCTACCTCCCGCCATA    79.03        −1.20                    17
AGGGTCAGTCTACCTCCCGCCATAA    79.03        −1.20                    18
GGGTCAGTCTACCTCCCGCCATAAA    76.85        −1.20                    19
GGTCAGTCTACCTCCCGCCATAAAA    73.10        −0.80                    20
GTCAGTCTACCTCCCGCCATAAAAA    69.50         0.90                    21
TCAGTCTACCTCCCGCCATAAAAAA    65.60         0.90                    22
CAGTCTACCTCCCGCCATAAAAAAC    64.96         0.90                    23
AGTCTACCTCCCGCCATAAAAAACT    65.           1.10                    24
GTCTACCTCCCGCCATAAAAAACTC    66.36         2.40                    25
TCTACCTCCCGCCATAAAAAACTCA    64.97         2.90                    26
CTACCTCCCGCCATAAAAAACTCAT    63.96         2.70                    27
TACCTCCCGCCATAAAAAACTCATG    62.58         1.10                    28
ACCTCCCGCCATAAAAAACTCATGT    65.10         0.40                    29
CCTCCCGCCATAAAAAACTCATGTT    64.96         0.10                    30
CTCCCGCCATAAAAAACTCATGTTC    63.37        −0.10                    31
TCCCGCCATAAAAAACTCATGTTCA    62.86        −0.10                    32
CCCGCCATAAAAAACTCATGTTCAA    60.47        −0.10                    33
CCGCCATAAAAAACTCATGTTCAAG    57.98        −0.10                    34
CGCCATAAAAAACTCATGTTCAAGA    56.20        −0.10                    35

[0184] Next, the oligonucleotide sequences are filtered on the basis of T_(m). A high and low cut-off value may be selected, for example, 60° C. ≤ T_(m) ≤ 85° C. Thus, oligonucleotides having T_(m) values falling within the above range are retained. Those outside the range are discarded, which is indicated below by lining out of those oligonucleotides and parameter values.

[0185] Next, the oligonucleotide sequences remaining after the above exercise are filtered on the basis of ΔG_(MFOLD) and are retained if the value is greater than −0.4. Those oligonucleotides with a ΔG_(MFOLD) less than −0.4 are discarded, which is indicated below by double lining out of those oligonucleotides and parameter values.

[0186] Clusters of retained oligonucleotides are identified and ranked based on cluster size. In this example, a contiguous cluster of 13 retained oligonucleotides is identified by the vertical black bar on the left. Any or all of the oligonucleotides in this cluster may be evaluated experimentally.
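The two filtering steps and the cluster identification of this example may be reproduced with the short sketch below, which uses the T_(m) and ΔG_(MFOLD) values listed for SEQ ID NOS:10-35. The variable names are hypothetical. The T_(m) value for SEQ ID NO:24 is incomplete in the listing ("65."), so 65.0 is entered purely as a placeholder assumption; any value beginning with 65 lies inside the 60-85° C. range, so the outcome does not depend on it.

```python
TM_LOW, TM_HIGH = 60.0, 85.0   # T_m cut-offs from the example, in degrees C
DG_MIN = -0.4                  # retain oligonucleotides with dG_MFOLD > -0.4

# (SEQ ID NO, T_m, dG_MFOLD) as listed for the overlapping 25-mers.
ROWS = [
    (10, 71.77, -1.20), (11, 71.99, -1.20), (12, 70.78, -1.20),
    (13, 71.23, -1.20), (14, 73.07, -1.20), (15, 75.68, -1.20),
    (16, 77.53, -1.20), (17, 79.03, -1.20), (18, 79.03, -1.20),
    (19, 76.85, -1.20), (20, 73.10, -0.80), (21, 69.50, 0.90),
    (22, 65.60, 0.90), (23, 64.96, 0.90),
    (24, 65.0, 1.10),   # T_m truncated in the listing; 65.0 is a placeholder
    (25, 66.36, 2.40), (26, 64.97, 2.90), (27, 63.96, 2.70),
    (28, 62.58, 1.10), (29, 65.10, 0.40), (30, 64.96, 0.10),
    (31, 63.37, -0.10), (32, 62.86, -0.10), (33, 60.47, -0.10),
    (34, 57.98, -0.10), (35, 56.20, -0.10),
]

# Apply both filters to every oligonucleotide.
retained = [sid for sid, tm, dg in ROWS
            if TM_LOW <= tm <= TM_HIGH and dg > DG_MIN]

# Group consecutive SEQ ID NOS into clusters of adjacent oligonucleotides.
clusters = []
for sid in retained:
    if clusters and sid == clusters[-1][-1] + 1:
        clusters[-1].append(sid)
    else:
        clusters.append([sid])

largest = max(clusters, key=len)
print(largest[0], largest[-1], len(largest))  # 21 33 13
```

The T_(m) filter removes SEQ ID NOS:34 and 35, the ΔG_(MFOLD) filter removes SEQ ID NOS:10-20, and the remaining 13 oligonucleotides form a single contiguous cluster, in agreement with the cluster size stated above.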

[0187] Alternatively, in one approach the oligonucleotides at the first quartile, the median and the third quartile of the cluster may be selected for experimental evaluation, indicated below by bold print.

[0188] In one aspect of the present method, at least two parameters are determined wherein the parameters are poorly correlated with respect to one another. The reason for requiring that the different parameters chosen are poorly correlated with one another is that an additional parameter that is strongly correlated to the original parameter brings no additional information to the prediction process. The correlation to the original parameter is a strong indication that both parameters represent the same physical property of the system. Another way of stating this is that correlated parameters are linearly dependent on one another, while poorly correlated parameters are linearly independent of one another. In practice, the absolute value of the correlation coefficient between any two parameters should be less than 0.5, more preferably, less than 0.25, and, most preferably, as close to zero as possible.
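A check of this kind may be sketched as follows. The text does not state which correlation coefficient is meant; the sketch assumes the ordinary Pearson coefficient. The function names and the sample values are hypothetical and serve only to illustrate the 0.5 limit.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def poorly_correlated(x: Sequence[float], y: Sequence[float],
                      limit: float = 0.5) -> bool:
    """True when |r| is below the limit (0.5 here; 0.25 is the stricter choice)."""
    return abs(pearson_r(x, y)) < limit


# Illustrative values only, not measured parameters.
param_a = [1.0, 2.0, 3.0, 4.0]
param_b = [2.0, 4.0, 6.0, 8.0]     # exactly proportional to param_a
param_c = [1.0, -1.0, -1.0, 1.0]   # no linear trend against param_a

print(poorly_correlated(param_a, param_b))  # False
print(poorly_correlated(param_a, param_c))  # True
```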

[0189] In one preferred approach, instead of T_(m), for each oligonucleotide/target nucleotide sequence duplex, the difference between the predicted duplex melting temperature corrected for salt concentration and the temperature of hybridization of each of the oligonucleotides with the target nucleotide sequence is determined.

[0190] In one aspect the present method comprises determining two parameters, at least one of the parameters being the association free energy between a subsequence within each of the oligonucleotides and its complementary sequence on the target nucleotide sequence, or some similar, strongly correlated parameter. The object of this approach is to identify a particularly stable subsequence of the oligonucleotide that might be capable of acting as a nucleation site for the beginning of the heteroduplex formation between the oligonucleotide and the target nucleotide sequence. Such nucleation is believed to be the rate-limiting step for the process of heteroduplex formation.

[0191] The subsequence within the oligonucleotide is from about 3 to 9 nucleotides in length, usually, 5 to 7 nucleotides in length. The subsequence is at least three nucleotides from the terminus of the oligonucleotide. For support-bound oligonucleotides the subsequence is at least three nucleotides from the free end of the oligonucleotide, i.e., the end that is not attached to the support. Generally, this free end is the 5′ end of the oligonucleotide. When the oligonucleotide is attached to a support, the subsequence is at least three nucleotides from the end of the oligonucleotide that is bound to the surface of the support to which the oligonucleotide is attached. Generally, the 3′ end of the oligonucleotide is bound to the support.

[0192] The predictive parameter can be, for example, either melting temperature or duplex free energy of the subsequence with the target nucleotide sequence. The subsequence with the maximum (melting temperature) or minimum (free energy) value of one of the above parameters is chosen as the representative subsequence for that oligonucleotide probe. For example, if the oligonucleotide is 20 nucleotides in length and a subsequence of 5 nucleotides is chosen, i.e., a 5-mer, then parameter values are calculated for all 5-mer subsequences of the oligonucleotide that do not include the 2 nucleotides at the free end of the oligonucleotide. Where 5′ is the free end of the oligonucleotide with designated nucleotide number 1, the values are calculated for all 5-mer subsequences with starting nucleotides from position number 3 to position number 16. Thus, in this example, parameter values for 14 different subsequences are calculated. The subsequence with the maximum value for the parameter is then assigned as the stability subsequence for the oligonucleotide.
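The enumeration in this example may be sketched as follows. The scoring function gc_count is a hypothetical stand-in used only so that the sketch runs; it is not the melting temperature or duplex free energy calculation of the present method, which would be substituted through the score argument. The 20-mer used is simply the first 20 nucleotides of SEQ ID NO:9, chosen for illustration.

```python
from typing import Callable, List, Tuple


def candidate_subsequences(oligo: str, k: int = 5,
                           skip_free_end: int = 2) -> List[Tuple[int, str]]:
    """(1-based start, k-mer) pairs that avoid the nucleotides at the free end.

    The 5' end is assumed to be the free end and is numbered 1.
    """
    return [(i + 1, oligo[i:i + k])
            for i in range(skip_free_end, len(oligo) - k + 1)]


def gc_count(kmer: str) -> int:
    """Hypothetical stand-in score: number of G and C bases in the k-mer."""
    return sum(base in "GC" for base in kmer)


def stability_subsequence(oligo: str, k: int = 5, skip_free_end: int = 2,
                          score: Callable[[str], float] = gc_count
                          ) -> Tuple[int, str]:
    """Candidate with the maximum score (first one wins on ties)."""
    return max(candidate_subsequences(oligo, k, skip_free_end),
               key=lambda cand: score(cand[1]))


oligo_20mer = "GTCCAAAAAGGGTCAGTCTA"   # illustrative 20-mer
candidates = candidate_subsequences(oligo_20mer)
print(len(candidates))                      # 14
print(candidates[0][0], candidates[-1][0])  # 3 16
print(stability_subsequence(oligo_20mer))   # (10, 'GGGTC') under the stand-in score
```

For a parameter where the minimum is sought, such as duplex free energy, the same function can be used by passing a score that returns the negated value.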

[0193] The inclusion of the above determination of a stability subsequence results in the following algorithm for determining the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined number of unique oligonucleotides are identified within a nucleotide sequence that is hybridizable with said target nucleotide sequence. The oligonucleotides are chosen to sample the entire length of the nucleotide sequence. For each of the oligonucleotides, parameters that are independently predictive of the ability of each of said oligonucleotides to hybridize to said target nucleotide sequence are determined and evaluated. Two parameters that may be used are the thermodynamic parameters of T_(m) and ΔG_(MFOLD). These parameters give rise to associated parameter filters. In one approach evaluation of the parameters involves establishing cut-off values as described above. Application of these cut-off values results in the identification of a subset of oligonucleotides for further scrutiny under the algorithm. In accordance with this embodiment of the present invention, there is included a stability subsequence limit in addition to the above. Cutoff values are determined either by means of objective optimization algorithms well known to the art or via graphical estimation methods; both approaches have been described previously in this document. In either case, the optimization of cutoff values involves comparison of predictions to known hybridization efficiency data sets. This process results in objective optimization as it looks at prediction versus experimental results and is otherwise referred to herein as “training the algorithm.” The experimental data used to train the algorithm is referred to herein as “training data.”

[0194] In the present approach filters are assigned to the T_(m) oligonucleotide probe data. The T_(m) of each oligonucleotide probe needs to be greater than or equal to the assigned filter (T_(m) probe limit) to be given a filter score of “1”; otherwise, the filter score is “0”. In addition, one can also impose a second filter for this parameter; that is, that the T_(m) of the oligonucleotide probe also has to be less than a defined upper limit. Filters are also assigned to the ΔG_(MFOLD) data. The ΔG_(MFOLD) of each oligonucleotide probe should be greater than or equal to the assigned filter (ΔG_(MFOLD) limit) to be given a filter score of “1”; otherwise, the filter score is “0”. The filter scores are added. Furthermore, one can also impose a second filter for this parameter; that is, that the ΔG_(MFOLD) also has to be less than a defined upper limit. In accordance with the above discussion stability subsequences are identified. This leads to another filter. Accordingly, filters are assigned to the stability sequence data. The stability subsequence of each oligonucleotide probe needs to be greater than or equal to the assigned filter limit to be given a filter score of “1”; otherwise, the filter score is “0”. In addition, one can also impose a second filter for this parameter; that is, that the stability subsequence also has to be less than a defined upper limit. In all cases, the filter values are determined by objective optimization (algorithmic or graphical) of the predictions of the present method versus training data, as described previously.

[0195] On the basis of the above filter sets a subset of oligonucleotides within said predetermined number of unique oligonucleotides is identified. Oligonucleotides in the subset are identified that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence. The resulting number of oligonucleotide probe regions is examined. The above filters may then be loosened or tightened by changing the filter limits to obtain more or fewer clusters of oligonucleotides to match the goal, which is set by the needs of the investigator. For instance, a particular application might require that the investigator design 5 non-overlapping probes that efficiently hybridize to a given target sequence.

[0196] As mentioned above, the contigs may be selected on the basis of contig length. In another approach, the scores defined above may be summed for cluster size determination. To this end the probe score of the particular filter set (e.g., T_(m) probe limit, ΔG_(MFOLD) limit and stability sequence limit) is calculated for each oligonucleotide probe. The probe score is the sum of the filter scores. Thus, the probe score is 0 if no parameters pass their respective filters. The probe score is 1, 2 or 3 if one, two or three parameters, respectively, pass their filters for that oligonucleotide probe. This summing is continued for each parameter that is in the current filter set of the algorithm used. For a given algorithm a minimum probe score limit is set. In the current example this limit will be at least 1 and could be 2 or 3 depending on the needs of the investigator, the number of probe clusters required and the results of objective optimizations of algorithm performance against training data. The probe score is compared to this probe score limit. If the probe score of oligonucleotide probe i is greater than or equal to the probe score limit, then oligonucleotide probe i is assigned a score passed value of 1. Next, a window is chosen for the evaluation of clustering (the “cluster window”). This will be the next filter applied. The cluster window (“w”) smoothes the score passed values by summing the values in a window w nucleotides long, centered about position i. The resulting sum is called the cluster sum. Usually, the cluster window is an odd integer, usually 7 or 9 nucleotides. The cluster sum values are then filtered, by comparing to a user-set threshold, cluster filter. If cluster sum is greater than or equal to cluster filter, this filter is passed, and the probe is predicted to hybridize efficiently to its target.
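The probe score, score passed value and cluster sum may be sketched as follows. All function names are hypothetical, the filter scores are illustrative rather than experimental, and the treatment of positions near the ends of the sequence (the window is simply truncated) is an assumption, since the text does not specify it. A window of 3 is used only to keep the example small.

```python
from typing import List, Sequence


def score_passed_values(filter_scores: Sequence[Sequence[int]],
                        probe_score_limit: int) -> List[int]:
    """1 where the sum of a probe's 0/1 filter scores reaches the limit, else 0."""
    return [1 if sum(scores) >= probe_score_limit else 0
            for scores in filter_scores]


def cluster_sums(passed: Sequence[int], w: int) -> List[int]:
    """Sum the score passed values in a window of w centered on each position.

    Windows are truncated at the ends of the list (an assumed convention).
    """
    half = w // 2
    n = len(passed)
    return [sum(passed[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def predicted(passed: Sequence[int], w: int, cluster_filter: int) -> List[bool]:
    """True where the cluster sum is at least the user-set cluster filter."""
    return [s >= cluster_filter for s in cluster_sums(passed, w)]


# One tuple of (T_m, dG_MFOLD, stability subsequence) filter scores per probe.
filter_scores = [(1, 1, 1), (1, 1, 0), (0, 0, 1), (1, 0, 1), (1, 1, 1),
                 (0, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0), (0, 0, 0)]

passed = score_passed_values(filter_scores, probe_score_limit=2)
print(passed)                     # [1, 1, 0, 1, 1, 1, 1, 0, 0, 0]
print(cluster_sums(passed, w=3))  # [2, 2, 2, 2, 3, 3, 2, 1, 0, 0]
print(predicted(passed, w=3, cluster_filter=2))
```

In this illustration the third probe fails its own probe score limit but still passes the cluster filter because both of its neighbors passed.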

[0197] This window summing procedure converts the score for the passed value for each oligonucleotide into a consensus metric for a set of w adjacent probes. A “consensus metric” is a measurement that distills a number of values into one consensus value. In this case, the consensus value is calculated by simply summing the individual values. The window summing procedure therefore evaluates a property similar to the contig length metric discussed above. However, the summed score has the advantage of allowing for a few probes within a cluster to have not passed their individual probe score limits. We have found that this allows more observed hybridization peaks to be predicted.

[0198] It may be desired in some circumstances to combine the results of multiple algorithm versions. We refer to this operation as “tiling”. This may be explained more fully as follows. Tiling generally involves joining together the predicted oligonucleotide probe sets identified by multiple algorithm versions. In the context of the present invention, tiling multiple algorithm versions involves forming the union of multiple sets of predictions. These predictions may arise from different embodiments of the present invention. Alternatively, the different sets of predictions may arise from the same embodiment, but different filter sets. The different filter sets may additionally be restricted to different combinations of parameter values. For instance, one filter set might be used when the predicted duplex melting temperature T_(m) is greater than or equal to some value, while another might be used when T_(m) is below that value.

[0199] An example of the logical endpoint of tiling multiple filter sets across different regions of the possible combinations of predictive parameters and then forming the union of the resulting predictions is the contour plot shown in FIG. 3, with the associated rule that “the value of the normalized hybridization intensity associated with a particular combination of (T_(m)−T_(hyb)) and ΔG_(MFOLD) must be greater than or equal to some threshold value.” In this case, the contour at the threshold value becomes the filter. This contour and its interior can be thought of as the union of many small rectangular regions (“tiles”), each of which is bracketed by low and high cutoff values for each of the parameters.

[0200] The predictions of different algorithm versions can also be combined by forming the intersection of two or more different predictions. The reliability of predictions within such intersection sets is enhanced because such sets are, by definition, insensitive to changes in the details of the predictive algorithm. Intersection is a useful method for reducing the number of predicted probes when a single algorithm version produces too many candidate probes for efficient experimental evaluation.

[0201] The most specific oligonucleotide probe set (i.e., the set least likely to include poor probes) will be the intersection set from multiple algorithms. Clusters that have overlapping oligonucleotide probes from multiple algorithms constitute the intersection set of oligonucleotide probes. The oligonucleotide probe that is in the center of an intersection cluster is chosen. This central oligonucleotide probe may have the highest probability of predicting a peak or, in other words, of binding well to the target nucleotide sequence. Oligonucleotide probes on either side of center, which are still within the intersection cluster, may also be selected. The distance of these “side” oligonucleotide probes from the center generally will be shorter or longer depending upon the length of the cluster.

[0202] The most sensitive set of oligonucleotide probes (i.e., the set most likely to include at least one good probe) is generally the union set from multiple algorithms. Clusters that are predicted by at least one type of algorithm constitute the union set of oligonucleotide probes. The oligonucleotide probe in the center of a union cluster is chosen. Oligonucleotide probes on either side of center, which are still within the union cluster, usually are also chosen. The distance of these side probes from the center will be shorter or longer depending upon the length of the cluster. In summary, the combination of using the stability subsequence parameter, tiling multiple filter sets, and making union and intersection cluster sets of oligonucleotide probes exhibits very high sensitivity and specificity in predicting oligonucleotide probes that effectively hybridize to a target nucleotide sequence of interest.
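The formation of union and intersection sets and the selection of a central probe may be sketched as follows. This is a minimal sketch under stated assumptions: each algorithm version is represented by the set of probe starting positions it predicts, the positions shown are invented for illustration, and the names clusters_of and center are hypothetical. For a cluster with an even number of members, the sketch takes the later of the two middle probes, which is an arbitrary choice.

```python
from typing import Iterable, List


def clusters_of(positions: Iterable[int]) -> List[List[int]]:
    """Group probe starting positions into runs of adjacent positions."""
    runs: List[List[int]] = []
    for p in sorted(positions):
        if runs and p == runs[-1][-1] + 1:
            runs[-1].append(p)   # extends the current cluster
        else:
            runs.append([p])     # begins a new cluster
    return runs


def center(run: List[int]) -> int:
    """Central probe of a cluster (later middle element when length is even)."""
    return run[len(run) // 2]


# Predicted starting positions from two hypothetical algorithm versions.
version_a = set(range(10, 21))                 # positions 10..20
version_b = set(range(15, 26)) | {40, 41, 42}  # positions 15..25 and 40..42

union_set = version_a | version_b         # most sensitive set
intersection_set = version_a & version_b  # most specific set

print([center(r) for r in clusters_of(intersection_set)])  # [18]
print([center(r) for r in clusters_of(union_set)])         # [18, 41]
```

The intersection yields a single cluster supported by both versions, while the union additionally retains the short cluster predicted by only one of them.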

[0203] Another aspect of the present invention is a computer based method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. A predetermined number of unique oligonucleotides within a nucleotide sequence that is hybridizable with the target nucleotide sequence is identified under computer control. The oligonucleotides are chosen to sample the entire length of the nucleotide sequence. A value is determined and evaluated under computer control for each of the oligonucleotides for at least one parameter that is independently predictive of the ability of each of the oligonucleotides to hybridize to the target nucleotide sequence. The parameter values are stored. Based on the examination of the stored parameter values, a subset of oligonucleotides within the predetermined number of unique oligonucleotides is identified under computer control. Then, oligonucleotides in the subset that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence are identified under computer control.

[0204] A computer program is utilized to carry out the above method steps. The computer program provides for input of a target-hybridizable or target-complementary nucleotide sequence, efficient algorithms for computation of oligonucleotide sequences and their associated predictive parameters, efficient, versatile mechanisms for filtering sets of oligonucleotide sequences based on parameter values, mechanisms for computation of the size of clusters of oligonucleotide sequences that pass multiple filters, and mechanisms for outputting the final predictions of the method of the present invention in a versatile, machine-readable or human-readable form.

[0205] Another aspect of the present invention is a computer system for conducting a method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence. An input means for introducing a target nucleotide sequence into the computer system is provided. The input means may permit manual input of the target nucleotide sequence. The input means may also be a database or a standard format file such as GenBank. Also included in the system is means for determining a number of unique oligonucleotide sequences that are within a nucleotide sequence that is hybridizable with the target nucleotide sequence. The oligonucleotide sequences are chosen to sample the entire length of the nucleotide sequence. Suitable means is a computer program or software, which also provides memory means for storing the oligonucleotide sequences. The system also includes means for controlling the computer system to carry out, for each of the oligonucleotide sequences, a determination and evaluation of a value for at least one parameter that is independently predictive of the ability of each of the oligonucleotide sequences to hybridize to the target nucleotide sequence. Suitable means is a computer program or software such as, for example, Microsoft® Excel spreadsheet, Microsoft® Access relational database or the like, which also provides memory means for storing the parameter values. The system further comprises means for controlling the computer to carry out an identification of a subset of oligonucleotide sequences within the number of unique oligonucleotide sequences based on the automated examination of the stored parameter values. Suitable means is a computer program or software, which also allocates memory means for storing the subset of oligonucleotides. The system also includes means for controlling the computer to carry out an identification of oligonucleotide sequences in the subset that are clustered along a region of the nucleotide sequence that is hybridizable to the target nucleotide sequence. Suitable means is a computer program or software, which also allocates memory means for storing the oligonucleotide sequences in the subset. The computer system also includes means for outputting data relating to the oligonucleotide sequences in the subset. Such means may be machine readable or human readable and may be software that communicates with a printer, electronic mail, another computer program, and the like. One particularly attractive feature of the present invention is that the outputting means may communicate directly with software that is part of an oligonucleotide synthesizer. In this way the results of the method of the present invention may be used directly to provide instruction for the synthesis of the desired oligonucleotides.

[0206] Another advantage of the present invention is that it may be used to predict efficient hybridization oligonucleotides for each of multiple target sequences. Thus, very large arrays may be constructed and tested with minimal synthesis of oligonucleotides.

EXAMPLES

[0207] The invention is demonstrated further by the following illustrative examples. Parts and percentages are by weight unless otherwise indicated. Temperatures are in degrees Centigrade (°C.) unless otherwise specified. The following preparations and examples illustrate the invention but are not intended to limit its scope. All reagents used herein were from Amresco, Inc., Solon, Ohio (buffers), Pharmacia Biotech, Piscataway, N.J. (nucleoside triphosphates) or Promega, Madison, Wisconsin (RNA polymerases) unless indicated otherwise.

Example 1

[0208] Synopsis: Data from labeled RNA target hybridizations to surface-bound DNA probes directed against 4 different gene sequences were compared to the predictions of the preferred version of the prediction algorithm illustrated by the flow chart in FIG. 2. The RNA targets were sequences derived from the human immunodeficiency virus protease-reverse transcriptase region (HIV PRT; sense-strand target polynucleotide), human glyceraldehyde-3-phosphate dehydrogenase gene (G3PDH; antisense-strand target polynucleotide), human tumor suppressor p53 gene (p53; antisense-strand target polynucleotide) and rabbit β-globin gene (β-globin; antisense-strand target polynucleotide). The GenBank accession numbers for the gene sequences, number of data points collected and temperature of hybridization have all been previously listed in Table 2.

[0209] Materials and Methods: Three different experimental systems and two different labeling schemes were used to collect data.

[0210] The sequence and hybridization data for β-globin were taken from the literature (see Milner et al., (1997), supra); in this experiment, ³²P-radiolabeled RNA target was used.

[0211] The hybridization data for HIV PRT were obtained using an Affymetrix GeneChip™ HIV PRT-sense probe array (i.e., sense strand target polynucleotide) (GeneChip™ HIV PRT 440s, Affymetrix Corporation, Santa Clara, Calif.) as specified by the manufacturer, except that the fluorescein-labeled RNA target was not fragmented prior to hybridization and that hybridization was performed for 24 hours. The concentration of fluorescein-labeled RNA used was 26.3 nM; label density was approximately 18 fluoresceinated uridyl nucleotides per 1 kilobase (kb) RNA transcript. The raw data were collected by scanning the array with a GeneChip™ Scanner 50 (Affymetrix Corporation, Santa Clara, Calif.), as specified by the manufacturer. The raw data were reduced to a feature-averaged (“.CEL”) file, using the GeneChip™ software supplied with the scanner. Finally, a table of hybridization intensities for perfect-complement 20-mer probes was constructed using the ASCII feature map file supplied with the GeneChip™ software to connect probe sequences to measured hybridization intensities. The resulting data set contained data for every overlapping 20-mer probe to the target sequence.

[0212] The data for G3PDH and p53 were measured using 93-feature arrays constructed using commercially available streptavidin-coated microtiter plates (Pierce Chemical Company, Rockford, Ill.). Every tenth possible 25-mer probe complementary to each target was synthesized and 3′-biotinylated by a contract synthesis vendor (Operon, Inc., Alameda, Calif.). The 3′-linked biotin was used to anchor individual probes to microtiter wells, via the well known, strong affinity of streptavidin for biotin. Biotinylated DNA probes were resuspended to a concentration of 10 μM in hybridization buffer (5× sodium chloride-sodium phosphate-disodium ethylenediaminetetraacetate (SSPE), 0.05% Triton X-100, filter-sterilized; 1× SSPE is 150 mM sodium chloride, 10 mM sodium phosphate, 1 mM disodium ethylenediaminetetraacetate (EDTA), pH 7.4). Individual probes were diluted 1:10 in hybridization buffer into specified wells (100 μl total volume per well) of a streptavidin-coated microtiter plate; probes were allowed to bind to the covered plates overnight at 35° C. The other 3 wells of the 96-well microtiter plate were probe-less controls. The coated plates were washed with 3×200 μl of wash buffer (6× SSPE, 0.005% Triton X-100, filter-sterilized). Fluorescein-labeled RNA (100 μl of a 10 nM solution in hybridization buffer) was added to each well. The plates were covered and hybridized at 35° C. for 20-24 hours. The hybridized plates were washed with 3×200 μl of wash buffer. Label was then released in each well by adding 100 μl of 20 μg/ml RNAase I (Sigma Chemical Company, St. Louis, Mo.) in Tris-EDTA (TE) (10 mM Tris(hydroxymethyl)aminomethane (Tris), 1 mM EDTA, pH 8.0, sterile) and incubating at 35° C. for at least 30 minutes. The fluorescence released from the surface of each well was quantitated with a PerSeptive Biosystems Cytofluor II microtiter plate fluorimeter (PerSeptive Biosystems, Inc., Framingham, Mass.) using the manufacturer's recommended excitation and emission filter sets for fluorescein. Each plate hybridization was performed in quadruplicate, and the data for each probe were averaged to obtain the hybridization intensity.

[0213] Labeled RNA targets specific for G3PDH and p53 were produced via T7 RNA polymerase transcription of DNA templates in the presence of fluorescein-UTP (Boehringer Mannheim Corporation, Indianapolis, Ind.), using the same method as that outlined by Affymetrix for their GeneChip™ HIV PRT sense probe array. The DNA template for G3PDH was purchased from a commercial source (Clontech, Inc., Palo Alto, Calif.). The DNA template for p53 was obtained by sub-cloning a PCR fragment from an ATCC-derived reference clone (No. 57254) of human p53 into the commercially-available PCR cloning vector pCR2.1-TOPO (Invitrogen, Inc., Carlsbad, Calif.), then linearizing the plasmid at the end of the polycloning site opposite the vector-derived T7 promoter.

[0214] Probe predictions were performed using a software application (referred to as “p5”) that was built atop Microsoft's Access relational database application, using added Visual Basic modules, the TrueDB Grid Pro 5.0 (Apex Software Corporation, Pittsburgh, Pa.) enhancement to Visual Basic, and a version of the FORTRAN application MFOLD, modified to run in a Windows NT 4.0 environment, as an ActiveX control. The Visual Basic source code for the p5 software application is found in the Microfiche appendix to this specification. The DNA target sequence complements that were input into p5 for division into potential oligonucleotide probe sequences are listed below:

[0215] Parent Sequence Accession No.: K03256

[0216] Locus: BUNGLOB.DNA (portion of rabbit β-globin)

[0217] Length: 122 (SEQ ID NO:36)

  1 TTCTTCCACA TTCACCTTGC CCCACAGGGC AGTGACCGCA GACTTCTCCT CACTGGACAG
 61 ATGCACCATT CTGTCTGTTT TGGGGGATTG CAAGTAAACA CAGTTGTGTC AAAAGCAAGT
121 GT

[0218] Parent Sequence Accession No.: M15654

[0219] Locus: HIV_PRTA.S (HIV PRT antisense; parses into probes specific for sense-strand target)

[0220] Length: 1040 1 TGTACTGTCC ATTTATCAGG ATGGAGTTCA TAACCCATCCAAAGGAATCG AGGTTCTTTC SEQ ID NO:37 61 TGATGTTTTT TGTCTGGTGT GGTAAGTCCCCACCTCAACA GATGTTGTCT CAGCTCCTCT 121 ATTTTTGTTC TATGCTGCCC TATTTCTAAGTCAGATCCTA CATACAAATC ATCCATGTAT 181 TGATAGATAA CTATGTCTGG ATTTTGTTTTTTAAAAGGCT CTAAGATTTT TGTCATGCTA 241 CTTTGGAATA TTGCTGGTGA TCCTTTCCATCCCTGTGGAA GCACATTGTA CTGATATCTA 301 ATCCCTGGTG TCTCATTGTT TATACTAGGTATGGTAAATG CAGTATACTT CCTGAAGTCT 361 TCATCTAAGG GAACTGAAAA ATATGCATCACCCACATCCA GTACTGTTAC TGATTTTTTC 421 TTTTTTAACC CTGCGGGATG TGGTATTCCTAATTGAACTT CCCAGAAGTC TTGAGTTCTC 481 TTATTAAGTT CTCTGAAATC TACTAATTTTCTCCATTTAG TACTGTCTTT TTTCTTTATG 541 GCAAATACTG GAGTATTGTA TGGATTCTCAGGCCCAATTT TTGAAATTTT CCCTTCCTTT 601 TCCATTTCTG TACAAATTTC TACTAATGCTTTTATTTTTT CTTCTGTCAA TGGCCATTGT 661 TTAACTTTTG GGCCATCCAT TCCTGGCTTTAATTTTACTG GTACAGTCTC AATAGGGCTA 721 ATGGGAAAAT TTAAAGTGCA ACCAATCTGAGTCAACAGAT TTCTTCCAAT TATGTTGACA 781 GGTGTAGGTC CTACTAATAC TGTACCTATAGCTTTATGTC CACAGATTTC TATGAGTATC 841 TGATCATACT GTCTTACTTT GATAAAACCTCCAATTCCCC CTATCATTTT TGGTTTCCAT 901 CTTCCTGGCA AACTCATTTC TTCTAATACTGTATCATCTG CTCCTGTATC TAATAGAGCT 961 TCCTTTAGTT GCCCCCCTAT CTTTATTGTGACGAGGGGTC GTTGCCAAAG AGTGATCTGA 1021 GGGAAGTTAA AGGATACAGT

[0221] Parent Sequence Accession No.: X01677

[0222] Locus: G3PDH (Clontech G3PDH template; parses into probes specific for antisense-strand target)

[0223] Length: 999 1 GAAGGTCGGA GTCAACGGAT TTGGTCGTAT TGGGCGCCTGGTCACCAGGG CTGCTTTTAA SEQ ID NO:38 61 CTCTGGTAAA GTGGATATTG TTGCCATCAATGACCCCTTC ATTGACCTCA ACTACATGGT 121 TTACATGTTC CAATATGATT CCACCCATGGCAAATTCCAT GGCACCGTCA AGGCTGAGAA 181 CGGGAAGCTT GTCATCAATG GAAATCCCATCACCATCTTC CAGGAGCGAG ATCCCTCCAA 241 AATCAAGTGG GGCGATGCTG GCGCTGAGTACGTCGTGGAG TCCACTGGCG TCTTCACCAC 301 CATGGAGAAG GCTGGGGCTC ATTTGCAGGGGGGAGCCAAA AGGGTCATCA TCTCTGCCCC 361 CTCTGCTGAT GCCCCCATGT TCGTCATGGGTGTGAACCAT GAGAAGTATG ACAACAGCCT 421 CAAGATCATC AGCAATGCCT CCTGCACCACCAACTGCTTA GCACCCCTGG CCAAGGTCAT 481 CCATGACAAC TTTGGTATCG TGGAAGGACTCATGACCACA GTCCATGCCA TCACTGCCAC 541 CCAGAAGACT GTGGATGGCC CCTCCGGGAAACTGTGGCGT GATGGCCGCG GGGCTCTCCA 601 GAACATCATC CCTGCCTCTA CTGGCGCTGCCAAGGCTGTG GGCAAGGTCA TCCCTGAGCT 661 AGACGGGAAG CTCACTGGCA TGGCCTTCCGTGTCCCCACT GCCAACGTGT CAGTGGTGGA 721 CCTGACCTGC CGTCTAGAAA AACCTGCCAAATATGATGAC ATCAAGAAGG TGGTGAAGCA 781 GGCGTCGGAG GGCCCCCTCA AAGGCATCCTGGGCTACACT GAGCACCAGG TGGTCTCCTC 841 TGACTTCAAC AGCGACACCC ACTCCTCCACCTTTGACGCT GGGGCTGGCA TTGCCCTCAA 901 CGACCACTTT GTCAAGCTCA TTTCCTGGTATGACAACGAA TTTGGCTACA GCAACAGGGT 961 GGTGGACCTC ATGGCCCACA TGCTATAGTGAGTCGTATT

[0224] Parent Sequence Accession No.: X54156

[0225] Locus: HSP53PCRa (p53 template; parses into probes specific for antisense-strand target)

[0226] Length: 1049 1 GAGGTGCGTG TTTGTGCCTG TCCTGGGAGA GACCGGCGCACAGAGGAAGA GAATCTCCGC SEQ ID NO:39 61 AAGAAAGGGG AGCCTCACCA CGAGCTGCCCCCAGGGAGCA CTAAGCGAGC ACTGCCCAAC 121 AACACCAGCT CCTCTCCCCA GCCAAAGAAGAAACCACTGG ATGGAGAATA TTTCACCCTT 181 CAGATCCGTG GGCGTGAGCG CTTCGAGATGTTCCGAGAGC TGAATGAGGC CTTGGAACTC 241 AAGGATGCCC AGGCTGGGAA GGAGCCAGGGGGGAGCAGGG CTCACTCCAG CCACCTGAAG 301 TCCAAAAAGG GTCAGTCTAC CTCCCGCCATAAAAAACTCA TGTTCAAGAC AGAAGGGCCT 361 GACTCAGACT GACATTCTCC ACTTCTTGTTCCCCACTGAC AGCCTCCCTC CCCCATCTCT 421 CCCTCCCCTG CCATTTTGGG TTTTGGGTCTTTGAACCCTT GCTTGCAATA GGTGTGCGTC 481 AGAAGCACCC AGGACTTCCA TTTGCTTTGTCCCGGGGCTC CACTGAACAA GTTGGCCTGC 541 ACTGGTGTTT TGTTGTGGGG AGGAGGATGGGGAGTAGGAC ATACCAGCTT AGATTTTAAG 601 GTTTTTACTG TGAGGGATGT TTGGGAGATGTAAGAAATGT TCTTGCAGTT AAGGGTTAGT 661 TTACAATCAG CCACATTCTA GGTAGGTAGGGGCCCACTTC AGCGTACTAA CCAGGGAAGC 721 TGTCCCTCAT GTTGAATTTT CTCTAACTTCAAGGCCCATA TCTGTGAAAT GCTGGCATTT 781 GCACCTACCT CACAGAGTGC ATTGTGAGGGTTAATGAAAT AATGTACATC TGGCCTTGAA 841 ACCACCTTTT ATTACATGGG GTCTAAAACTTGACCCCCTT GAGGGTGCCT GTTCCCTCTC 901 CCTCTCCCTG TTGGCTGGTG GGTTGGTAGTTTCTACAGTT GGGCAGCTGG TTAGGTAGAG 961 GGAGTTGTCA AGTCTTGCTG GCCCAGCCAAACCCTGTCTG ACAACCTCTT GGTCGACCTT 1021 AGTACCTAAA AGGAAATCTC ACCCCATCC

[0227] The sequences indicated above, which are complements of the target sequences, were divided into overlapping oligonucleotide sequences with one nucleotide between starting positions. The oligonucleotide sequence lengths were 17 (rabbit β-globin), 20 (HIV PRT) or 25 (G3PDH; p53). The oligonucleotide sequence lengths were dictated by the probe lengths used in the experiments to which the predictions were compared. The RNA target concentrations used to calculate predicted RNA/DNA duplex melting temperatures were 100 pM (rabbit β-globin), 26.3 nM (HIV PRT) and 10 nM (G3PDH; p53). These were also dictated by experimental conditions for the comparison data. The cut-off filter used for the predicted free energy of the most stable probe sequence intramolecular structure, ΔG_(MFOLD), was $\Delta G_{MFOLD} \geq -0.4\ \mathrm{kcal/mole}.$
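The tiling step described in this paragraph can be illustrated with a minimal Python sketch. It is not the p5 source code; the function name `tile_probes` and its parameters are hypothetical, and the 20-nucleotide example string is simply the start of the rabbit β-globin template as it can be read from the first four rows of Table 3.

```python
def tile_probes(template: str, probe_length: int, step: int = 1) -> list[tuple[int, str]]:
    """Return (1-based start position, subsequence) for every full-length window.

    `step` is the number of nucleotides between successive starting
    positions (one in the analyses described here).
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    last_start = len(template) - probe_length  # 0-based index of the final window
    return [
        (i + 1, template[i:i + probe_length])
        for i in range(0, last_start + 1, step)
    ]


# First 20 nt of the rabbit beta-globin template, tiled into 17-mers.
for position, probe in tile_probes("TTCTTCCACATTCACCTTGC", 17):
    print(position, probe)
```

A template of length N yields N − L + 1 probes of length L at a spacing of one nucleotide, which is why the roughly 1,000-nucleotide templates above each give approximately 1,000 candidate probes.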

[0228] The filter condition used for the predicted RNA/DNA duplex melting temperature was

25° C. ≤ T_(m) + 16.6 log([Na⁺]) − T_(hyb) ≤ 50° C.,

[0229] where T_(m) is the target concentration-dependent value of the predicted RNA/DNA duplex melting temperature before correction for salt concentration, the term “16.6 log([Na⁺])” corrects the melting temperature for salt effects, and T_(hyb) is the hybridization temperature. The values of the salt correction term and T_(hyb) have already been listed in Table 2. For convenient use within p5, the above condition was algebraically rearranged into the equivalent form

25° C. − 16.6 log([Na⁺]) + T_(hyb) ≤ T_(m) ≤ 50° C. − 16.6 log([Na⁺]) + T_(hyb).
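As a minimal sketch of how the two filter conditions could be applied to one probe, the Python below computes the rearranged T_(m) window and combines it with the ΔG_(MFOLD) cut-off. The function names are hypothetical, the logarithm is assumed to be base 10, and the sodium concentration and hybridization temperature in the usage lines are illustrative placeholders rather than the Table 2 values.

```python
import math


def tm_window(na_molar: float, t_hyb_c: float,
              lower_c: float = 25.0, upper_c: float = 50.0) -> tuple[float, float]:
    """Bounds on the uncorrected T_m, from the rearranged filter condition."""
    salt_term = 16.6 * math.log10(na_molar)  # assumption: base-10 logarithm
    return lower_c - salt_term + t_hyb_c, upper_c - salt_term + t_hyb_c


def passes_filters(tm_c: float, dg_mfold: float, na_molar: float,
                   t_hyb_c: float, dg_cutoff: float = -0.4) -> bool:
    """True when both the dG_MFOLD cut-off and the T_m window are satisfied."""
    low, high = tm_window(na_molar, t_hyb_c)
    return dg_mfold >= dg_cutoff and low <= tm_c <= high


# Illustrative conditions only (1.0 M Na+, 35 deg C hybridization).
print(tm_window(1.0, 35.0))
print(passes_filters(tm_c=61.1, dg_mfold=0.5, na_molar=1.0, t_hyb_c=35.0))
```

Precomputing the window once per experiment, as the rearranged form allows, means each probe then needs only two comparisons against its predicted T_(m).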

[0230] Clusters were ranked according to the number of contiguous oligonucleotide sequences that passed through the filter set (“contig” length).
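The contig ranking can be sketched as a scan for runs of consecutive probes that passed the filter set, followed by a sort on run length. This is an illustrative sketch, not the p5 implementation; `find_contigs` is a hypothetical name, and the `min_length=2` default (a single isolated passing probe is not reported as a contig) is an assumption.

```python
def find_contigs(passed: list[bool], min_length: int = 2) -> list[tuple[int, int]]:
    """Return (1-based start position, length) of each run of passing probes,
    longest first; runs of equal length keep their order along the sequence."""
    contigs = []
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending at the last probe.
    for index, ok in enumerate(list(passed) + [False]):
        if ok and run_start is None:
            run_start = index
        elif not ok and run_start is not None:
            length = index - run_start
            if length >= min_length:
                contigs.append((run_start + 1, length))
            run_start = None
    return sorted(contigs, key=lambda contig: -contig[1])  # stable sort


# Pass/fail pattern matching the contig-length entries of Table 3 (105 probes).
passing = set(range(5, 12)) | set(range(32, 39)) | {62, 63, 70, 71}
flags = [position in passing for position in range(1, 106)]
print(find_contigs(flags))
```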

[0231] Results: The detailed analysis results for rabbit β-globin are presented in Table 3; a graphical summary of the results is shown in FIG. 4. In Table 3, values of T_(m) and ΔG_(MFOLD) that were excluded by the filter set are shown with a line through them, and table entries for contig length are shown in gray when the oligonucleotide sequence in question was not in a contig. The top 20% of the observed hybridization intensities are shown underlined.

TABLE 3
Oligonucleotide Position | Sequence | SEQ ID NO: | T_(m) (° C.) | ΔG_(MFOLD) (kcal/mole) | Contig Length | Hybridization Intensity (Milner et al., 1997)
1 TTCTTCCACATTCACCT 40

5.00 100 2 TCTTCCACATTCACCTT 41

5.00 130 3 CTTCCACATTCACCTTG 42

0.90 130 4 TTCCACATTCACCTTGC 43

0.50 200 5 TCCACATTCACCTTGCC 44 58.46 0.50 7 120 6 CCACATTCACCTTGCCC 4561.10 0.50 7 180 7 CACATTCACCTTGCCCC 46 61.10 0.50 7 230 8ACATTCACCTTGCCCCA 47 61.10 0.50 7 220 9 CATTCACCTTGCCCCAC 48 61.10 0.907 320 10 ATTCACCTTGCCCCACA 49 61.10 0.70 7 310 11 TTCACCTTGCCCCACAG 5061.33 0.50 7 320 12 TCACCTTGCCCCACAGG 51 63.70

390 13 CACCTTGCCCCACAGGG 52 64.85

410 14 ACCTTGCCCCACAGGGC 53 68.01

240 15 CCTTGCCCCACAGGGCA 54 68.63

 50 16 CTTGCCCCACAGGGCAG 55 64.95

 20 17 TTGCCCCACAGGGCAGT 56 66.31

 20 18 TGCCCCACAGGGCAGTG 57 65.79

 20 19 GCCCCACAGGGCAGTGA 58 67.37

 20 20 CCCCACAGGGCAGTGAC 59 63.42

 40 21 CCCACAGGGCAGTGACC 60 63.42

 20 22 CCACAGGGCAGTGACCG 61 59.85

 20 23 CACAGGGCAGTGACCGC 62 60.14

 20 24 ACAGGGCAGTGACCGCA 63 60.14

 20 25 CAGGGCAGTGACCGCAG 64 59.76

 30 26 AGGGCAGTGACCGCAGA 65 59.83

 20 27 GGGCAGTGACCGCAGAC 66 60.22

 30 28 GGCAGTGACCGCAGACT 67 59.53

 30 29 GCAGTGACCGCAGACTT 68 57.06

 30 30 CAGTGACCGCAGACTTC 69

 40 31 AGTGACCGCAGACTTCT 70

−0.20    40 32 GTGACCGCAGACTTCTC 71 55.99 0.60 7 100 33TGACCGCAGACTTCTCC 72 57.01 0.60 7 120 34 GACCGCAGACTTCTCCT 73 59.22 0.607 180 35 ACCGCAGACTTCTCCTC 74 59.28 0.60 7 210 36 CCGCAGACTTCTCCTCA 7560.07 0.60 7 200 37 CGCAGACTTCTCCTCAC 76 56.34 0.60 7 190 38GCAGACTTCTCCTCACT 77 57.79 0.60 7 240 39 CAGACTTCTCCTCACTG 78

0.60 240 40 AGACTTCTCCTCACTGG 79

0.00 340 41 GACTTCTCCTCACTGGA 80 55.77

340 42 ACTTCTCCTCACTGGAC 81

240 43 CTTCTCCTCACTGGACA 82 55.75

240 44 TTCTCCTCACTGGACAG 83

120 45 TCTCCTCACTGGACAGA 84

100 46 CTCCTCACTGGACAGAT 85

110 47 TCCTCACTGGACAGATG 86

 80 48 CCTCACTGGACAGATGC 87

0.00 240 49 CTCACTGGACAGATGCA 88

0.20  90 50 TCACTGGACAGATGCAC 89

0.20  30 51 CACTGGACAGATGCACC 90

0.50 100 52 ACTGGACAGATGCACCA 91

 80 53 CTGGACAGATGCACCAT 92

 90 54 TGGACAGATGCACCATT 93

 80 55 GGACAGATGCACCATTC 94

0.30 180 56 GACAGATGCACCATTCT 95

−0.10   220 57 ACAGATGCACCATTCTG 96

120 58 CAGATGCACCATTCTGT 97

120 59 AGATGCACCATTCTGTC 98

−0.10   250 60 GATGCACCATTCTGTCT 99

0.30 520 61 ATGCACCATTCTGTCTG 100

0.40 980 62 TGCACCATTCTGTCTGT 101 56.05 0.20 2 780 63 GCACCATTCTGTCTGTT102 56.52 0.20 2 810 64 CACCATTCTGTCTGTTT 103

0.20 220 65 ACCATTCTGTCTGTTTT 104

0.20 120 66 CCATTCTGTCTGTTTTG 105

0.20 120 67 CATTCTGTCTGTTTTGG 106

0.60 160 68 ATTCTGTCTGTTTTGGG 107

1.70 310 69 TTCTGTCTGTTTTGGGG 108

1.70 250 70 TCTGTCTGTTTTGGGGG 109 55.90 1.70 2  80 71 CTGTCTGTTTTGGGGGA110 55.91 1.40 2  30 72 TGTCTGTTTTGGGGGAT 111

0.90  50 73 GTCTGTTTTGGGGGATT 112

0.90  10 74 TCTGTTTTGGGGGATTG 113

1.10  10 75 CTGTTTTGGGGGATTGC 114

2.20  10 76 TGTTTTGGGGGATTGCA 115

1.20  10 77 GTTTTGGGGGATTGCAA 116

0.00  5 78 TTTTGGGGGATTGCAAG 117

−0.20    5 79 TTTGGGGGATTGCAAGT 118

−0.20    5 80 TTGGGGGATTGCAAGTA 119

0.00  5 81 TGGGGGATTGCAAGTAA 120

1.20  5 82 GGGGGATTGCAAGTAAA 121

1.40  5 83 GGGGATTGCAAGTAAAC 122

1.40  5 84 GGGATTGCAAGTAAACA 123

1.30  5 85 GGATTGCAAGTAAACAC 124

0.90  5 86 GATTGCAAGTAAACACA 125

0.50  5 87 ATTGCAAGTAAACACAG 126

0.50  5 88 TTGCAAGTAAACACAGT 127

0.50  5 89 TGCAAGTAAACACAGTT 128

0.30  5 90 GCAAGTAAACACAGTTG 129

0.10  10 91 CAAGTAAACACAGTTGT 130

−0.30    5 92 AAGTAAACACAGTTGTG 131

 5 93 AGTAAACACAGTTGTGT 132

 5 94 GTAAACACAGTTGTGTC 133

 5 95 TAAACACAGTTGTGTCA 134

 5 96 AAACACAGTTGTGTCAA 135

 5 97 AACACAGTTGTGTCAAA 136

 5 98 ACACAGTTGTGTCAAAA 137

 10 99 CACAGTTGTGTCAAAAG 138

 15 100 ACAGTTGTGTCAAAAGC 139

 30 101 CAGTTGTGTCAAAAGCA 140

0.20  25 102 AGTTGTGTCAAAAGCAA 141

−0.10    25 103 GTTGTGTCAAAAGCAAG 142

−0.30    20 104 TTGTGTCAAAAGCAAGT 143

−0.10   120 105 TGTGTCAAAAGCAAGTG 144

0.50  20

[0232] In FIG. 4, the hybridization intensity observed experimentally is plotted as a function of oligonucleotide starting position in the target-complementary sequence that was input into p5. The identified contigs are plotted as horizontal bars, with the contig rank (by length) shown in parentheses next to each bar. It is clear from Table 3 and FIG. 4 that the prediction algorithm identified contigs that overlap all of the “top 20%” hybridization intensity peaks observed. Iterative experimental improvement of these predictions would converge on each of the observed intensity maxima in 3-4 iterations.

[0233] Prediction worksheets for HIV PRT, G3PDH and p53 were prepared in a manner similar to that for rabbit β-globin as shown in Table 3, except that the probes were longer as indicated above and that approximately 1,000 probes were analyzed for each of these genes. The results of these analyses are shown in FIG. 5 (HIV PRT), FIG. 6 (G3PDH) and FIG. 7 (p53). In FIG. 5, data are plotted for all possible 20-mer oligonucleotide probes. In FIGS. 6 and 7, data were available for only every 10th 25-mer probe, and the actual data points are plotted as open diamonds.

[0234] It is clear from FIGS. 5-7 that the hybridization efficiency prediction algorithm of the present invention performed well in the task of identifying regions with observed high hybridization intensity. In each case, the 4 longest contigs point to good-to-excellent regions for experimental investigation. It should be noted that the contigs usually bracket observed intensity peaks; experimental iterative refinement would therefore be expected to converge in 2-3 iterations. By this is meant that certain oligonucleotides from the identified contigs are prepared and subjected to evaluation in actual hybridization experiments. Based on the results of such experiments, the observed signal is evaluated to determine whether the oligonucleotides are hybridizing to the left of, the right of, or on the center of a peak with respect to the graphed data. The next iteration is carried out to experimentally evaluate the hybridization efficiency of probes that are inferred to lie closer to the peak of hybridization efficiency, based on the data from the previous iteration. Iteration is continued until the signal level is deemed acceptable by the user, or the local hybridization efficiency maximum is reached (i.e. the best probe in the cluster identified by the method of the current invention has been experimentally identified). A detailed illustration of this process is shown in Example 3.
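The iterative refinement described in this paragraph resembles a simple hill climb along the probe starting position. The sketch below is one possible reading, not a procedure taken from the specification: `measure` is a hypothetical stand-in for an actual hybridization experiment on the probe at a given position, and the initial step size of 3 and the step-halving rule are assumptions made for illustration.

```python
from typing import Callable, Optional


def refine_probe(measure: Callable[[int], float], start: int, lo: int, hi: int,
                 step: int = 3, acceptable: Optional[float] = None,
                 max_iterations: int = 10) -> tuple[int, float]:
    """Move toward higher signal until it is acceptable or a local maximum is found."""
    cache: dict[int, float] = {}

    def signal(position: int) -> float:
        # Each measurement is an experiment, so never repeat one.
        if position not in cache:
            cache[position] = measure(position)
        return cache[position]

    best = start
    for _ in range(max_iterations):
        if acceptable is not None and signal(best) >= acceptable:
            break  # user-acceptable signal level reached
        neighbours = [p for p in (best - step, best + step) if lo <= p <= hi]
        candidate = max(neighbours, key=signal, default=None)
        if candidate is not None and signal(candidate) > signal(best):
            best = candidate              # peak lies in that direction
        elif step > 1:
            step = max(1, step // 2)      # bracketing the peak: look closer
        else:
            break                         # local maximum within the contig
    return best, signal(best)


# Synthetic signal with a single peak at position 61, for demonstration only.
print(refine_probe(lambda p: -(p - 61) ** 2, start=55, lo=50, hi=70))
```

In this toy run only a handful of distinct positions are ever measured, which is the point of the procedure: the contig narrows the search so that a few test probes suffice.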

[0235] It should be noted that clusters of predictions that overlap the maxima of observed peaks of hybridization efficiency will often yield user-acceptable probes on the first iteration. Thus, the method of the present invention is much more efficient than current methods in which every potential probe is synthesized. For instance, in the HIV PRT example shown in FIG. 5, at least 3 good probes would be identified after synthesis of ~10 test probes (i.e. statistical sampling of the 3 longest contigs). This is much more efficient than synthesizing the ~1,000 probes represented by the data in FIG. 5.

Example 2

[0236] Synopsis: Data from a labeled RNA target hybridization to an Affymetrix GeneChip™ HIV PRT-sense probe array (GeneChip™ HIV PRT 440s, Affymetrix Corporation, Santa Clara, Calif.) were compared to the predictions of the window-averaged composite dimensionless score version of the method of the present invention.

[0237] Materials and Methods: Data were obtained as described for the Affymetrix GeneChip™ HIV PRT-sense probe array (GeneChip™ HIV PRT 440s, Affymetrix Corporation, Santa Clara, Calif.) in Example 1. The DNA sequence (SEQ ID NO: 37) complementary to the fluorescein-labeled RNA target was divided into overlapping 20-mer oligonucleotide sequences spaced one nucleotide apart, using the prototype application p5; p5 was also used to calculate the predicted values of the RNA/DNA heteroduplex melting temperature (T_(m)) and the free energy of the most stable predicted probe intramolecular structure, ΔG_(MFOLD), as described in Example 1. The probe sequences and parameter values were then transferred to a Microsoft® Excel® spreadsheet, which was used to complete the predictions of efficient and inefficient probes. The weight was obtained by optimizing the performance of the algorithm with the data of Milner et al., supra, as the training data using the Microsoft® Excel® spreadsheet software. The composite score was calculated using a weight of 0.62 for the dimensionless T_(m) score and a weight of 0.38 for the ΔG_(MFOLD) dimensionless score. The windowed-averaging was performed using a window width of 7 and Microsoft® Excel® spreadsheet software. Finally, the oligonucleotide sequences having the top 10% of the window-averaged composite dimensionless scores were predicted to be efficient probes, while the oligonucleotide sequences having the bottom 10% of the window-averaged composite dimensionless scores were predicted to be inefficient probes.
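The scoring steps of this paragraph can be sketched in Python as follows. This is an illustration rather than the spreadsheet actually used: the function names are hypothetical, the dimensionless T_(m) and ΔG_(MFOLD) scores are taken as given inputs, the window is assumed to be centered on each probe (so the first and last three positions have no averaged value, which agrees with the Table 4 values checked below), and the way the top and bottom 10% are counted is one possible interpretation.

```python
from typing import Optional


def composite_scores(tm_scores: list[float], dg_scores: list[float],
                     w_tm: float = 0.62, w_dg: float = 0.38) -> list[float]:
    """Weighted sum of the two dimensionless scores for each probe."""
    return [w_tm * t + w_dg * g for t, g in zip(tm_scores, dg_scores)]


def window_average(values: list[float], width: int = 7) -> list[Optional[float]]:
    """Centered moving average; positions without a full window get None."""
    if width % 2 == 0:
        raise ValueError("width must be odd for a centered window")
    half = width // 2
    averaged: list[Optional[float]] = [None] * len(values)
    for i in range(half, len(values) - half):
        averaged[i] = sum(values[i - half:i + half + 1]) / width
    return averaged


def classify(averaged: list[Optional[float]], fraction: float = 0.10) -> list[str]:
    """Label the top and bottom `fraction` of the defined averaged scores."""
    ranked = sorted((v, i) for i, v in enumerate(averaged) if v is not None)
    k = max(1, int(fraction * len(ranked)))  # assumption: at least one per tail
    labels = ["unclassified"] * len(averaged)
    for _, i in ranked[:k]:
        labels[i] = "inefficient"
    for _, i in ranked[-k:]:
        labels[i] = "efficient"
    return labels


# Composite scores for probe positions 59-65 of Table 4; the centered
# average corresponds to the window-averaged entry for position 62.
print(window_average([0.702, 0.549, 0.666, 0.798, 1.023, 1.284, 1.239])[3])
```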

[0238] Results: The calculated parameters and scores are shown in Table 4; the algorithm predictions are also shown diagrammatically in FIG. 8. In Table 4, window-averaged composite score values that were in the top 10% of the distribution of values are shown in bold type, values that were in the bottom 10% are shown in italics, and all other values are shown with a line through them. It is clear from both Table 4 and FIG. 8 that the window-averaged composite dimensionless score embodiment of the current invention correctly predicted both efficient and inefficient hybridization probes for HIV PRT sense-strand RNA. As in Example 1, statistical sampling of contiguous stretches of predicted “good” probes would lead to convergence of the design process to the best probes in each region in 2-4 design iterations.

TABLE 4
p5 Probe Position | DNA Probe Sequence | SEQ ID NO: | RNA/DNA T_(m) (° C.) | ΔG_(MFOLD) (kcal/mole @ 35° C.) | T_(m) Score | ΔG_(MFOLD) Score | Composite Score | Window-Averaged Composite Score | HIV PRT GeneChip™ Data
1 GTACTGTCCATTTATCAGGA 145 64.16 −0.10 0.557 −0.199 0.269 1152.2
2 TACTGTCCATTTATCAGGAT 146 60.91 −0.40 0.080 −0.460 −0.125 1040.7
3 ACTGTCCATTTATCAGGATG 147 61.41 −0.90 0.152 −0.895 −0.246 291.9
4 CTGTCCATTTATCAGGATGG 148 63.46 −0.90 0.453 −0.895 −0.059

221.8 5 TGTCCATTTATCAGGATGGA 149 62.82 −0.90 0.360 −0.895 −0.117

148.3 6 GTCCATTTATCAGGATGGAG 150 63.15 −1.90 0.408 −1.764 −0.418

84.6 7 TCCATTTATCAGGATGGAGT 151 63.15 −2.10 0.408 −1.938 −0.484

128.7 8 CCATTTATCAGGATGGAGTT 152 62.03 −1.90 0.245 −1.764 −0.519

94.6 9 CATTTATCAGGATGGAGTTC 153 59.53 −0.60 −0.122 −0.634 −0.317

157.5 10 ATTTATCAGGATGGAGTTCA 154 59.53 0.80 −0.122 0.583 0.146

316.9 11 TTTATCAGGATGGAGTTCAT 155 59.53 0.40 −0.122 0.236 0.014

360.2 12 TTATCAGGATGGAGTTCATA 156 58.58 0.40 −0.262 0.236 −0.073

403.8 13 TATCAGGATGGAGTTCATAA 157 56.21 0.20 −0.609 0.062 −0.354

382.5 14 ATCAGGATGGAGTTCATAAC 158 57.34 0.20 −0.444 0.062 −0.252

324.4 15 TCAGGATGGAGTTCATAACC 159 61.25 0.20 0.129 0.062 0.104

320.5 16 CAGGATGGAGTTCATAACCC 160 63.57 0.20 0.470 0.062 0.315

238.9 17 AGGATGGAGTTCATAACCCA 161 63.57 −0.10 0.470 −0.199 0.216

202.3 18 GGATGGAGTTCATAACCCAT 162 63.34 −1.30 0.436 −1.243 −0.202

113.6 19 GATGGAGTTCATAACCCATC 163 62.24 −2.00 0.275 −1.851 −0.533

97.7 20 ATGGAGTTCATAACCCATCC 164 64.62 −3.30 0.624 −2.982 −0.746

143.3 21 TGGAGTTCATAACCCATCCC 165 68.18 −2.00 1.146 −1.851 0.007

484.6 22 GGAGTTCATAACCCATCCCA 166 69.39 −1.60 1.324 −1.504 0.249

857.6 23 GAGTTCATAACCCATCCCAA 167 64.93 −0.20 0.670 −0.286 0.307

991.4 24 AGTTCATAACCCATCCCAAA 168 61.82 0.20 0.213 0.062 0.155

907.0 25 GTTCATAACCCATCCCAAAG 169 61.82 0.20 0.213 0.062 0.155

887.9 26 TTCATAACCCATCCCAAAGG 170 61.36 0.60 0.145 0.410 0.246

1015.3 27 TCATAACCCATCCCAAAGGA 171 62.21 −0.10 0.270 −0.199 0.092

279.7 28 CATAACCCATCCCAAAGGAA 172 59.26 −0.30 −0.163 −0.373 −0.243

210.7 29 ATAACCCATCCCAAAGGAAT 173 58.19 −0.30 −0.320 −0.373 −0.340

179.9 30 TAACCCATCCCAAAGGAATG 174 58.13 −0.30 −0.328 −0.373 −0.345

91.8 31 AACCCATCCCAAAGGAATGG 175 60.78 −1.30 0.061 −1.243 −0.435

44.6 32 ACCCATCCCAAAGGAATGGA 176 63.69 −2.00 0.487 −1.851 −0.401

42.9 33 CCCATCCCAAAGGAATGGAG 177 63.40 −2.20 0.445 −2.025 −0.494

45.0 34 CCATCCCAAAGGAATGGAGG 178 62.34 −2.30 0.290 −2.112 −0.623

45.3 35 CATCCCAAAGGAATGGAGGT 179 61.72 −2.60 0.199 −2.373 −0.778

47.9 36 ATCCCAAAGGAATGGAGGTT 180 60.90 −2.20 0.079 −2.025 −0.721

49.2 37 TCCCAAAGGAATGGAGGTTC 181 62.24 −2.20 0.274 −2.025 −0.600

74.2 38 CCCAAAGGAATGGAGGTTCT 182 62.71 −2.00 0.344 −1.851 −0.490

125.5 39 CCAAAGGAATGGAGGTTCTT 183 59.47 −0.70 −0.132 −0.721 −0.356

183.3 40 CAAAGGAATGGAGGTTCTTT 184 56.10 −0.30 −0.627 −0.373 −0.530

261.4 41 AAAGGAATGGAGGTTCTTTC 185 56.11 −0.30 −0.625 −0.373 −0.529

518.3 42 AAGGAATGGAGGTTCTTTCT 186 60.05 −0.30 −0.046 −0.373 −0.170

716.5 43 AGGAATGGAGGTTCTTTCTG 187 62.09 −0.30 0.253 −0.373 0.015

1056.0 44 GGAATGGAGGTTCTTTCTGA 188 63.23 −0.30 0.420 −0.373 0.119

1084.3 45 GAATGGAGGTTCTTTCTGAT 189 60.56 0.10 0.028 −0.025 0.008

1241.1 46 AATGGAGGTTCTTTCTGATG 190 59.12 0.30 −0.183 0.149 −0.057

1278.8 47 ATGGAGGTTCTTTCTGATGT 191 64.58 0.30 0.618 0.149 0.440

1616.0 48 TGGAGGTTCTTTCTGATGTT 192 64.98 0.30 0.677 0.149 0.476

1677.5 49 GGAGGTTCTTTCTGATGTTT 193 65.49 0.30 0.751 0.149 0.522

1963.1 50 GAGGTTCTTTCTGATGTTTT 194 63.04 0.30 0.392 0.149 0.300

2126.1 51 AGGTTCTTTCTGATGTTTTT 195 61.97 0.30 0.235 0.149 0.202

2143.3 52 GGTTCTTTCTGATGTTTTTT 196 62.11 0.30 0.256 0.149 0.215

3540.6 53 GTTCTTTCTGATGTTTTTTG 197 59.21 0.30 −0.170 0.149 −0.049

1728.7 54 TTCTTTCTGATGTTTTTTGT 198 59.21 0.30 −0.170 0.149 −0.049

1364.3 55 TCTTTCTGATGTTTTTTGTC 199 60.35 0.50 −0.002 0.323 0.121

1788.4 56 CTTTCTGATGTTTTTTGTCT 200 60.96 1.20 0.086 0.931 0.407

2670.9 57 TTTCTGATGTTTTTTGTCTG 201 58.76 1.20 −0.235 0.931 0.208

3336.2 58 TTCTGATGTTTTTTGTCTGG 202 61.17 1.20 0.118 0.931 0.427

6683.6 59 TCTGATGTTTTTTGTCTGGT 203 64.20 1.20 0.562 0.931 0.702

10227.0 60 CTGATGTTTTTTGTCTGGTG 204 62.51 1.20 0.315 0.931 0.549

10965.0 61 TGATGTTTTTTGTCTGGTGT 205 63.80 1.20 0.504 0.931 0.666

11133.0 62 GATGTTTTTTGTCTGGTGTG 206 63.80 1.60 0.504 1.279 0.798 0.89411503.0 63 ATGTTTTTTGTCTGGTGTGG 207 65.18 1.90 0.705 1.540 1.023 0.8949492.8 64 TGTTTTTTGTCTGGTGTGGT 208 68.78 1.70 1.234 1.366 1.284 0.91410704.0 65 GTTTTTTGTCTGGTGTGGTA 209 68.28 1.70 1.161 1.366 1.239 0.93310741.0 66 TTTTTTGTCTGGTGTGGTAA 210 62.37 1.70 0.294 1.366 0.701 0.9509187.5 67 TTTTTGTCTGGTGTGGTAAG 211 62.23 1.70 0.273 1.366 0.689 0.9417871.0 68 TTTTGTCTGGTGTGGTAAGT 212 65.28 1.20 0.721 0.931 0.801 0.9217209.1 69 TTTGTCTGGTGTGGTAAGTC 213 66.56 1.20 0.908 0.931 0.917 0.9598052.3 70 TTGTCTGGTGTGGTAAGTCC 214 70.25 0.30 1.449 0.149 0.955 1.0227230.6 71 TGTCTGGTGTGGTAAGTCCC 215 73.77 −0.10 1.966 −0.199 1.143 0.9986809.5 72 GTCTGGTGTGGTAAGTCCCC 216 77.74 −0.10 2.549 −0.199 1.504 0.9137442.8 73 TCTGGTGTGGTAAGTCCCCA 217 75.28 −0.50 2.187 −0.547 1.148

2627.7 74 CTGGTGTGGTAAGTCCCCAC 218 74.18 −2.10 2.026 −1.938 0.519

1315.0 75 TGGTGTGGTAAGTCCCCACC 219 75.80 −3.50 2.263 −3.156 0.204

4182.3 76 GGTGTGGTAAGTCCCCACCT 220 77.89 −3.80 2.571 −3.417 0.296

474.7 77 GTGTGGTAAGTCCCCACCTC 221 77.05 −2.50 2.448 −2.286 0.649

682.4 78 TGTGGTAAGTCCCCACCTCA 222 74.71 −2.50 2.105 −2.286 0.436

679.1 79 GTGGTAAGTCCCCACCTCAA 223 72.54 −2.10 1.785 −1.938 0.370

924.0 80 TGGTAAGTCCCCACCTCAAC 224 69.94 −0.90 1.404 −0.895 0.531

835.5 81 GGTAAGTCCCCACCTCAACA 225 71.14 −0.50 1.580 −0.547 0.772

1213.6 82 GTAAGTCCCCACCTCAACAG 226 68.97 0.90 1.262 0.670 1.037

1106.1 83 TAAGTCCCCACCTCAACAGA 227 67.18 0.90 0.999 0.670 0.874 0.8721009.0 84 AAGTCCCCACCTCAACAGAT 228 67.68 0.50 1.073 0.323 0.788 0.9081656.2 85 AGTCCCCACCTCAACAGATG 229 69.68 0.50 1.366 0.323 0.970

2178.3 86 GTCCCCACCTCAACAGATGT 230 72.56 0.20 1.789 0.062 1.132

2567.0 87 TCCCCACCTCAACAGATGTT 231 69.77 −0.10 1.379 −0.199 0.779

3000.5 88 CCCCACCTCAACAGATGTTG 232 68.19 −1.30 1.148 −1.243 0.240

2025.4 89 CCCACCTCAACAGATGTTGT 233 67.78 −2.00 1.087 −1.851 −0.030

429.2 90 CCACCTCAACAGATGTTGTC 234 65.65 −2.00 0.775 −1.851 −0.223

157.9 91 CACCTCAACAGATGTTGTCT 235 63.85 −2.00 0.511 −1.851 −0.387

135.3 92 ACCTCAACAGATGTTGTCTC 236 64.11 −2.00 0.549 −1.851 −0.363

330.8 93 CCTCAACAGATGTTGTCTCA 237 64.77 −2.00 0.646 −1.851 −0.303

900.0 94 CTCAACAGATGTTGTCTCAG 238 61.08 −2.00 0.104 −1.851 −0.639

1177.0 95 TCAACAGATGTTGTCTCAGC 239 63.40 −2.00 0.444 −1.851 −0.428

795.1 96 CAACAGATGTTGTCTCAGCT 240 63.91 −1.60 0.520 −1.504 −0.249

889.2 97 AACAGATGTTGTCTCAGCTC 241 64.19 −0.10 0.560 −0.199 0.272

1703.6 98 ACAGATGTTGTCTCAGCTCC 242 70.61 0.00 1.503 −0.112 0.889

3115.2 99 CAGATGTTGTCTCAGCTCCT 243 72.08 0.00 1.719 −0.112 1.023 0.8474445.0 100 AGATGTTGTCTCAGCTCCTC 244 72.66 0.20 1.803 0.062 1.141 1.0706762.8 101 GATGTTGTCTCAGCTCCTCT 245 74.49 0.90 2.071 0.670 1.539 1.2278845.0 102 ATGTTGTCTCAGCTCCTCTA 246 72.38 0.80 1.763 0.583 1.314 1.2539010.6 103 TGTTGTCTCAGCTCCTCTAT 247 72.38 0.80 1.763 0.583 1.314 1.26019941.0 104 GTTGTCTCAGCTCCTCTATT 248 72.97 0.80 1.849 0.583 1.368 1.25712577.0 105 TTGTCTCAGCTCCTCTATTT 249 69.70 0.80 1.369 0.583 1.071 1.1497503.3 106 TGTCTCAGCTCCTCTATTTT 250 69.70 0.80 1.369 0.583 1.071 1.0987033.8 107 GTCTCAGCTCCTCTATTTTT 251 70.26 0.80 1.451 0.583 1.121 1.0248276.7 108 TCTCAGCTCCTCTATTTTTG 252 66.57 0.80 0.910 0.583 0.786 0.9422899.0 109 CTCAGCTCCTCTATTTTTGT 253 68.39 0.80 1.177 0.583 0.952 0.9232935.0 110 TCAGCTCCTCTATTTTTGTT 254 66.69 0.80 0.927 0.583 0.796 0.9301512.8 111 CAGCTCCTCTATTTTTGTTC 255 66.69 0.80 0.927 0.583 0.796 0.8721708.8 112 AGCTCCTCTATTTTTGTTCT 256 67.52 1.00 1.050 0.757 0.939 0.8331977.3 113 GCTCCTCTATTTTTGTTCTA 257 66.63 1.80 0.919 1.453 1.122

2114.8 114 CTCCTCTATTTTTGTTCTAT 258 62.13 1.80 0.259 1.453 0.713

1527.3 115 TCCTCTATTTTTGTTCTATG 259 59.97 1.80 −0.058 1.453 0.516

1536.8 116 CCTCTATTTTTGTTCTATGC 260 62.84 1.80 0.363 1.453 0.777

1824.5 117 CTCTATTTTTGTTCTATGCT 261 60.87 1.50 0.074 1.192 0.499

1169.2 118 TCTATTTTTGTTCTATGCTG 262 58.71 1.50 −0.244 1.192 0.302

683.7 119 CTATTTTTGTTCTATGCTGC 263 61.60 1.50 0.181 1.192 0.565

1306.8 120 TATTTTTGTTCTATGCTGCC 264 63.53 1.50 0.464 1.192 0.741

2523.6 121 ATTTTTGTTCTATGCTGCCC 265 67.96 1.50 1.113 1.192 1.143 0.9316682.0 122 TTTTTGTTCTATGCTGCCCT 266 69.96 1.50 1.407 1.192 1.325 1.0609417.4 123 TTTTGTTCTATGCTGCCCTA 267 69.01 1.50 1.267 1.192 1.239 1.15110339.0 124 TTTGTTCTATGCTGCCCTAT 268 68.62 1.50 1.210 1.192 1.203 1.25410750.0 125 TTGTTCTATGCTGCCCTATT 269 68.62 1.50 1.210 1.192 1.203 1.28211180.0 126 TGTTCTATGCTGCCCTATTT 270 68.62 1.50 1.210 1.192 1.203 1.27111060.0 127 GTTCTATGCTGCCCTATTTC 271 70.37 1.80 1.468 1.453 1.462 1.22116074.0 128 TTCTATGCTGCCCTATTTCT 272 69.00 1.80 1.266 1.453 1.337 1.1449183.8 129 TCTATGCTGCCCTATTTCTA 273 68.05 1.80 1.127 1.453 1.251 1.0828617.8 130 CTATGCTGCCCTATTTCTAA 274 64.38 1.70 0.589 1.366 0.884 1.0407286.8 131 TATGCTGCCCTATTTCTAAG 275 62.71 1.50 0.344 1.192 0.666 0.9783642.4 132 ATGCTGCCCTATTTCTAAGT 276 66.39 0.80 0.883 0.583 0.769 0.8833799.7 133 TGCTGCCCTATTTCTAAGTC 277 67.95 0.80 1.112 0.583 0.911

3408.3 134 GCTGCCCTATTTCTAAGTCA 278 69.25 0.80 1.303 0.583 1.030

4017.4 135 CTGCCCTATTTCTAAGTCAG 279 65.26 0.80 0.718 0.583 0.667

2197.2 136 TGCCCTATTTCTAAGTCAGA 280 64.63 −0.10 0.626 −0.199 0.312

1125.0 137 GCCCTATTTCTAAGTCAGAT 281 64.73 −0.60 0.639 −0.634 0.156

1306.3 138 CCCTATTTCTAAGTCAGATC 282 61.98 −0.60 0.236 −0.634 −0.094

1019.5 139 CCTATTTCTAAGTCAGATCC 283 61.98 −0.60 0.236 −0.634 −0.094

1852.3 140 CTATTTCTAAGTCAGATCCT 284 60.05 −0.60 −0.046 −0.634 −0.270

3159.3 141 TATTTCTAAGTCAGATCCTA 285 57.43 −0.60 −0.430 −0.634 −0.508

2604.8 142 ATTTCTAAGTCAGATCCTAC 286 58.59 −0.60 −0.261 −0.634 −0.402

3986.1 143 TTTCTAAGTCAGATCCTACA 287 59.91 −0.60 −0.068 −0.634 −0.283

4500.7 144 TTCTAAGTCAGATCCTACAT 288 59.55 −0.60 −0.120 −0.634 −0.315

4754.5 145 TCTAAGTCAGATCCTACATA 289 58.62 −0.40 −0.257 −0.460 −0.334

3802.1 146 CTAAGTCAGATCCTACATAC 290 57.80 1.20 −0.377 0.931 0.120

5069.4 147 TAAGTCAGATCCTACATACA 291 57.13 1.30 −0.476 1.018 0.092

3965.2 148 AAGTCAGATCCTACATACAA 292 55.78 1.30 −0.673 1.018 −0.030

3862.3 149 AGTCAGATCCTACATACAAA 293 55.78 1.30 −0.673 1.018 −0.030

2868.9 150 GTCAGATCCTACATACAAAT 294 55.62 1.70 −0.697 1.366 0.087

3542.9 151 TCAGATCCTACATACAAATC 295 54.02 1.50 −0.932 1.192 −0.125

2477.1 152 CAGATCCTACATACAAATCA 296 54.07 1.10 −0.924 0.844 −0.252

2522.4 153 AGATCCTACATACAAATCAT 297 52.83 1.10 −1.106 0.844 −0.365

2554.6 154 GATCCTACATACAAATCATC 298 53.87 1.50 −0.953 1.192 −0.138

3580.0 155 ATCCTACATACAAATCATCC 299 56.33 1.80 −0.591 1.453 0.185

5937.7 156 TCCTACATACAAATCATCCA 300 57.54 1.80 −0.415 1.453 0.295

4606.7 157 CCTACATACAAATCATCCAT 301 56.32 1.80 −0.594 1.453 0.184

4877.2 158 CTACATACAAATCATCCATG 302 52.68 1.10 −1.128 0.844 −0.379

2608.6 159 TACATACAAATCATCCATGT 303 53.56 0.30 −0.999 0.149 −0.563

1491.7 160 ACATACAAATCATCCATGTA 304 53.56 −0.10 −0.999 −0.199 −0.695

1364.3 161 CATACAAATCATCCATGTAT 305 53.07 −0.80 −1.071 −0.808 −0.971−0.751 1089.8 162 ATACAAATCATCCATGTATT 306 52.11 −1.10 −1.211 −1.069−1.157 −0.818 1008.6 163 TACAAATCATCCATGTATTG 307 52.08 −0.40 −1.215−0.460 −0.928 −0.891 624.8 164 ACAAATCATCCATGTATTGA 308 53.86 0.20−0.955 0.062 −0.568 −0.921 535.8 165 CAAATCATCCATGTATTGAT 309 53.36−0.50 −1.027 −0.547 −0.845 −0.860 3019.6 166 AAATCATCCATGTATTGATA 31051.57 −0.70 −1.291 −0.721 −1.074 −0.753 214.0 167 AATCATCCATGTATTGATAG311 53.47 −0.70 −1.012 −0.721 −0.901

212.7 168 ATCATCCATGTATTGATAGA 312 56.66 −0.50 −0.543 −0.547 −0.545

165.2 169 TCATCCATGTATTGATAGAT 313 56.66 −0.10 −0.543 −0.199 −0.412

166.0 170 CATCCATGTATTGATAGATA 314 54.80 0.30 −0.817 0.149 −0.450

151.0 171 ATCCATGTATTGATAGATAA 315 51.69 0.30 −1.273 0.149 −0.733

101.8 172 TCCATGTATTGATAGATAAC 316 52.19 0.30 −1.199 0.149 −0.687

84.0 173 CCATGTATTGATAGATAACT 317 52.89 0.30 −1.097 0.149 −0.623 −0.850130.3 174 CATGTATTGATAGATAACTA 318 48.47 0.70 −1.746 0.496 −0.894 −0.93767.8 175 ATGTATTGATAGATAACTAT 319 47.12 0.00 −1.944 −0.112 −1.248 −1.00665.7 176 TGTATTGATAGATAACTATG 320 47.11 −0.20 −1.945 −0.286 −1.315−1.048 90.0 177 GTATTGATAGATAACTATGT 321 49.90 −0.20 −1.536 −0.286−1.061 −1.099 125.9 178 TATTGATAGATAACTATGTC 322 48.24 −0.20 −1.779−0.286 −1.212 −1.083 132.6 179 ATTGATAGATAACTATGTCT 323 50.78 −0.20−1.407 −0.286 −0.981 −0.998 167.4 180 TTGATAGATAACTATGTCTG 324 50.75−0.20 −1.411 −0.286 −0.984 −0.916 219.0 181 TGATAGATAACTATGTCTGG 32553.01 −0.20 −1.080 −0.286 −0.778 −0.866 722.6 182 GATAGATAACTATGTCTGGA326 54.36 −0.20 −0.881 −0.286 −0.655 −0.774 825.1 183ATAGATAACTATGTCTGGAT 327 53.04 −0.10 −1.074 −0.199 −0.742

844.4 184 TAGATAACTATGTCTGGATT 328 53.37 −0.10 −1.027 −0.199 −0.712

912.6 185 AGATAACTATGTCTGGATTT 329 54.27 0.10 −0.895 −0.025 −0.565

1301.8 186 GATAACTATGTCTGGATTTT 330 54.43 0.80 −0.870 0.583 −0.318

1367.4 187 ATAACTATGTCTGGATTTTG 331 53.08 1.50 −1.070 1.192 −0.210

1284.2 188 TAACTATGTCTGGATTTTGT 332 56.05 1.50 −0.634 1.192 0.060

1162.5 189 AACTATGTCTGGATTTTGTT 333 56.97 1.50 −0.499 1.192 0.144

1396.7 190 ACTATGTCTGGATTTTGTTT 334 59.38 1.50 −0.145 1.192 0.363

1348.3 191 CTATGTCTGGATTTTGTTTT 335 59.16 1.50 −0.177 1.192 0.343

1092.8 192 TATGTCTGGATTTTGTTTTT 336 57.45 1.50 −0.428 1.192 0.188

912.6 193 ATGTCTGGATTTTGTTTTTT 337 58.41 1.70 −0.287 1.366 0.341

994.3 194 TGTCTGGATTTTGTTTTTTA 338 57.81 2.00 −0.375 1.627 0.386

840.7 195 GTCTGGATTTTGTTTTTTAA 339 55.82 1.00 −0.667 0.757 −0.126

941.9 196 TCTGGATTTTGTTTTTTAAA 340 50.98 0.80 −1.377 0.583 −0.632

84.9 197 CTGGATTTTGTTTTTTAAAA 341 48.16 0.30 −1.790 0.149 −1.054

78.6 198 TGGATTTTGTTTTTTAAAAG 342 46.41 0.10 −2.048 −0.025 −1.279 −0.85193.2 199 GGATTTTGTTTTTTAAAAGG 343 48.87 0.10 −1.686 −0.025 −1.055 −0.93356.0 200 GATTTTGTTTTTTAAAAGGC 344 50.22 0.10 −1.488 −0.025 −0.932 −0.91249.9 201 ATTTTGTTTTTTAAAAGGCT 345 50.84 0.10 −1.397 −0.025 −0.876 −0.84355.0 202 TTTTGTTTTTTAAAAGGCTC 346 52.03 0.30 −1.223 0.149 −0.702 −0.76864.6 203 TTTGTTTTTTAAAAGGCTCT 347 53.64 0.50 −0.987 0.323 −0.489

162.8 204 TTGTTTTTTAAAAGGCTCTA 348 52.76 0.50 −1.115 0.323 −0.569

265.8 205 TGTTTTTTAAAAGGCTCTAA 349 50.71 0.50 −1.417 0.323 −0.756

288.5 206 GTTTTTTAAAAGGCTCTAAG 350 50.86 0.50 −1.395 0.323 −0.742

548.4 207 TTTTTTAAAAGGCTCTAAGA 351 49.40 0.70 −1.609 0.496 −0.809

524.7 208 TTTTTAAAAGGCTCTAAGAT 352 49.11 1.20 −1.651 0.931 −0.670 −0.746937.9 209 TTTTAAAAGGCTCTAAGATT 353 49.11 1.20 −1.651 0.931 −0.670 −0.7901440.3 210 TTTAAAAGGCTCTAAGATTT 354 49.11 1.20 −1.651 0.931 −0.670−0.820 1633.3 211 TTAAAAGGCTCTAAGATTTT 355 49.11 0.50 −1.651 0.323−0.901 −0.735 1987.4 212 TAAAAGGCTCTAAGATTTTT 356 49.11 0.00 −1.651−0.112 −1.067

1792.3 213 AAAAGGCTCTAAGATTTTTG 357 49.63 0.20 −1.575 0.062 −0.953

2218.9 214 AAAGGCTCTAAGATTTTTGT 358 54.13 1.20 −0.914 0.931 −0.213

2371.4 215 AAGGCTCTAAGATTTTTGTC 359 57.38 1.20 −0.439 0.931 0.082

3308.9 216 AGGCTCTAAGATTTTTGTCA 360 60.78 0.80 0.061 0.583 0.260

4070.5 217 GGCTCTAAGATTTTTGTCAT 361 60.56 0.80 0.028 0.583 0.239

5394.5 218 GCTCTAAGATTTTTGTCATG 362 57.81 0.80 −0.376 0.583 −0.011

2025.5 219 CTCTAAGATTTTTGTCATGC 363 57.81 0.80 −0.376 0.583 −0.011

1741.9 220 TCTAAGATTTTTGTCATGCT 364 57.81 0.80 −0.376 0.583 −0.011

1707.6 221 CTAAGATTTTTGTCATGCTA 365 55.87 0.80 −0.660 0.583 −0.187

1783.0 222 TAAGATTTTTGTCATGCTAC 366 54.43 0.80 −0.872 0.583 −0.319

3131.4 223 AAGATTTTTGTCATGCTACT 367 56.99 0.60 −0.495 0.410 −0.151

4892.5 224 AGATTTTTGTCATGCTACTT 368 59.39 0.60 −0.144 0.410 0.067

5856.4 225 GATTTTTGTCATGCTACTTT 369 59.54 0.60 −0.122 0.410 0.080

6439.0 226 ATTTTTGTCATGCTACTTTG 370 58.09 0.60 −0.334 0.410 −0.051

5820.3 227 TTTTTGTCATGCTACTTTGG 371 60.78 0.60 0.060 0.410 0.193

5189.6 228 TTTTGTCATGCTACTTTGGA 372 61.79 0.60 0.209 0.410 0.285

4721.7 229 TTTGTCATGCTACTTTGGAA 373 59.35 0.60 −0.149 0.410 0.063

4221.0 230 TTGTCATGCTACTTTGGAAT 374 59.00 0.60 −0.200 0.410 0.032

4279.0 231 TGTCATGCTACTTTGGAATA 375 58.10 0.60 −0.333 0.410 −0.051

4102.0 232 GTCATGCTACTTTGGAATAT 376 58.16 0.90 −0.324 0.670 0.054

5069.8 233 TCATGCTACTTTGGAATATT 377 55.52 0.90 −0.711 0.670 −0.186

2407.9 234 CATGCTACTTTGGAATATTG 378 54.23 1.30 −0.900 1.018 −0.171

2443.0 235 ATGCTACTTTGGAATATTGC 379 56.90 1.40 −0.508 1.105 0.105

2324.3 236 TGCTACTTTGGAATATTGCT 380 58.82 0.90 −0.227 0.670 0.114

1894.1 237 GCTACTTTGGAATATTGCTG 381 58.82 1.30 −0.227 1.018 0.246

2363.8 238 CTACTTTGGAATATTGCTGG 382 57.35 1.70 −0.443 1.366 0.244

1363.0 239 TACTTTGGAATATTGCTGGT 383 58.39 1.70 −0.290 1.366 0.339

1217.5 240 ACTTTGGAATATTGCTGGTG 384 58.88 1.70 −0.217 1.366 0.384

1621.8 241 CTTTGGAATATTGCTGGTGA 385 59.64 1.70 −0.106 1.366 0.453

1438.2 242 TTTGGAATATTGCTGGTGAT 386 57.72 1.80 −0.388 1.453 0.311

1608.0 243 TTGGAATATTGCTGGTGATC 387 58.73 1.80 −0.241 1.453 0.403

2334.6 244 TGGAATATTGCTGGTGATCC 388 62.18 0.50 0.266 0.323 0.288

3776.7 245 GGAATATTGCTGGTGATCCT 389 64.19 −0.20 0.561 −0.286 0.239

5648.7 246 GAATATTGCTGGTGATCCTT 390 61.99 −0.20 0.238 −0.286 0.039

5358.8 247 AATATTGCTGGTGATCCTTT 391 61.03 −0.20 0.097 −0.286 −0.049

5517.2 248 ATATTGCTGGTGATCCTTTC 392 64.63 −0.20 0.625 −0.286 0.279

6246.4 249 TATTGCTGGTGATCCTTTCC 393 68.48 −0.20 1.190 −0.286 0.629

9975.1 250 ATTGCTGGTGATCCTTTCCA 394 70.22 −0.20 1.446 −0.286 0.788

11990.0 251 TTGCTGGTGATCCTTTCCAT 395 70.22 −0.60 1.446 −0.634 0.655

11543.0 252 TGCTGGTGATCCTTTCCATC 396 71.48 −0.60 1.631 −0.634 0.7700.862 14125.0 253 GCTGGTGATCCTTTCCATCC 397 75.32 −0.60 2.193 −0.6341.119 0.936 23489.0 254 CTGGTGATCCTTTCCATCCC 398 74.58 −0.60 2.085−0.634 1.052 1.022 15975.0 255 TGGTGATCCTTTCCATCCCT 399 74.58 −0.702.085 −0.721 1.019 1.082 16053.0 256 GGTGATCCTTTCCATCCCTG 400 74.58−0.30 2.085 −0.373 1.151 1.136 19205.0 257 GTGATCCTTTCCATCCCTGT 40175.40 0.20 2.206 0.062 1.391 1.080 17872.0 258 TGATCCTTTCCATCCCTGTG 40271.89 0.20 1.691 0.062 1.072 0.955 12871.0 259 GATCCTTTCCATCCCTGTGG 40374.58 −0.30 2.085 −0.373 1.151

8792.7 260 ATCCTTTCCATCCCTGTGGA 404 74.58 −1.60 2.085 −1.504 0.721

5609.6 261 TCCTTTCCATCCCTGTGGAA 405 72.27 −2.60 1.746 −2.373 0.181

3018.0 262 CCTTTCCATCCCTGTGGAAG 406 71.00 −2.80 1.559 −2.547 −0.001

1802.6 263 CTTTCCATCCCTGTGGAAGC 407 71.60 −2.80 1.648 −2.547 0.054

1074.0 264 TTTCCATCCCTGTGGAAGCA 408 70.81 −2.80 1.532 −2.547 −0.018

1132.5 265 TTCCATCCCTGTGGAAGCAC 409 71.02 −2.60 1.562 −2.373 0.067

1454.5 266 TCCATCCCTGTGGAAGCACA 410 71.74 −1.70 1.669 −1.591 0.430

1676.8 267 CCATCCCTGTGGAAGCACAT 411 70.20 −2.20 1.443 −2.025 0.125

2268.9 268 CATCCCTGTGGAAGCACATT 412 67.07 −2.20 0.983 −2.025 −0.160

1682.6 269 ATCCCTGTGGAAGCACATTG 413 65.82 −2.20 0.801 −2.025 −0.273

1753.9 270 TCCCTGTGGAAGCACATTGT 414 68.98 −2.20 1.263 −2.025 0.014

1281.8 271 CCCTGTGGAAGCACATTGTA 415 66.92 −2.20 0.962 −2.025 −0.173

1227.8 272 CCTGTGGAAGCACATTGTAC 416 63.84 −2.20 0.509 −2.025 −0.454

700.3 273 CTGTGGAAGCACATTGTACT 417 62.01 −2.20 0.241 −2.025 −0.620

618.7 274 TGTGGAAGCACATTGTACTG 418 59.99 −2.00 −0.056 −1.851 −0.738

771.5 275 GTGGAAGCACATTGTACTGA 419 61.39 −0.50 0.149 −0.547 −0.115

1180.6 276 TGGAAGCACATTGTACTGAT 420 58.35 0.50 −0.296 0.323 −0.061

1160.5 277 GGAAGCACATTGTACTGATA 421 57.86 0.50 −0.368 0.323 −0.106

1314.7 278 GAAGCACATTGTACTGATAT 422 55.32 0.50 −0.740 0.323 −0.336

1102.5 279 AAGCACATTGTACTGATATC 423 55.30 0.50 −0.744 0.323 −0.339

1222.1 280 AGCACATTGTACTGATATCT 424 59.26 0.50 −0.162 0.323 0.022

1893.2 281 GCACATTGTACTGATATCTA 425 58.48 0.50 −0.277 0.323 −0.049

2097.7 282 CACATTGTACTGATATCTAA 426 52.51 0.50 −1.152 0.323 −0.592

1237.8 283 ACATTGTACTGATATCTAAT 427 51.20 0.50 −1.345 0.323 −0.711

959.5 284 CATTGTACTGATATCTAATC 428 51.89 0.10 −1.244 −0.025 −0.781

1149.1 285 ATTGTACTGATATCTAATCC 429 54.53 −0.30 −0.856 −0.373 −0.672

2351.3 286 TTGTACTGATATCTAATCCC 430 58.41 −0.30 −0.287 −0.373 −0.320

4191.6 287 TGTACTGATATCTAATCCCT 431 59.99 −0.30 −0.055 −0.373 −0.176

5565.8 288 GTACTGATATCTAATCCCTG 432 59.99 −0.30 −0.055 −0.373 −0.176

9980.2 289 TACTGATATCTAATCCCTGG 433 59.52 −0.30 −0.124 −0.373 −0.218

6318.9 290 ACTGATATCTAATCCCTGGT 434 63.07 −0.30 0.397 −0.373 0.104

7749.5 291 CTGATATCTAATCCCTGGTG 435 62.43 −0.30 0.303 −0.373 0.046

8165.3 292 TGATATCTAATCCCTGGTGT 436 63.60 −0.30 0.474 −0.373 0.152

9107.6 293 GATATCTAATCCCTGGTGTC 437 65.19 0.10 0.707 −0.025 0.429

13914.0 294 ATATCTAATCCCTGGTGTCT 438 65.82 1.50 0.800 1.192 0.949

15093.0 295 TATCTAATCCCTGGTGTCTC 439 67.41 1.50 1.033 1.192 1.093

18647.0 296 ATCTAATCCCTGGTGTCTCA 440 69.20 1.30 1.296 1.018 1.190 0.904

21810.0 297 TCTAATCCCTGGTGTCTCAT 441 69.20 0.80 1.296 0.583 1.025 0.996

20102.0 298 CTAATCCCTGGTGTCTCATT 442 67.98 0.80 1.117 0.583 0.914 1.052

20967.0 299 TAATCCCTGGTGTCTCATTG 443 65.90 0.80 0.811 0.583 0.725 1.092

18200.0 300 AATCCCTGGTGTCTCATTGT 444 69.78 0.80 1.380 0.583 1.077 1.088

19845.0 301 ATCCCTGGTGTCTCATTGTT 445 72.61 0.80 1.797 0.583 1.336 1.057

19231.0 302 TCCCTGGTGTCTCATTGTTT 446 73.04 0.80 1.860 0.583 1.375 0.981

17629.0 303 CCCTGGTGTCTCATTGTTTA 447 70.72 0.80 1.519 0.583 1.164 0.918

17009.0 304 CCTGGTGTCTCATTGTTTAT 448 66.82 0.80 0.946 0.583 0.808

11580.0 305 CTGGTGTCTCATTGTTTATA 449 62.17 0.80 0.264 0.583 0.386

8374.6 306 TGGTGTCTCATTGTTTATAC 450 60.65 0.90 0.042 0.670 0.281

6153.3 307 GGTGTCTCATTGTTTATACT 451 62.88 0.20 0.369 0.062 0.252

7134.0 308 GTGTCTCATTGTTTATACTA 452 59.43 0.20 −0.138 0.062 −0.062

4435.2 309 TGTCTCATTGTTTATACTAG 453 56.35 0.20 −0.589 0.062 −0.342

2035.5 310 GTCTCATTGTTTATACTAGG 454 59.21 0.20 −0.170 0.062 −0.082

2466.6 311 TCTCATTGTTTATACTAGGT 455 59.21 0.20 −0.170 0.062 −0.082

1080.9 312 CTCATTGTTTATACTAGGTA 456 57.15 0.20 −0.472 0.062 −0.269

956.0 313 TCATTGTTTATACTAGGTAT 457 55.08 0.20 −0.776 0.062 −0.458

529.4 314 CATTGTTTATACTAGGTATG 458 53.70 0.20 −0.978 0.062 −0.583

471.4 315 ATTGTTTATACTAGGTATGG 459 55.01 0.20 −0.785 0.062 −0.463

510.4 316 TTGTTTATACTAGGTATGGT 460 58.17 0.20 −0.322 0.062 −0.176

531.0 317 TGTTTATACTAGGTATGGTA 461 57.21 0.20 −0.463 0.062 −0.264

613.3 318 GTTTATACTAGGTATGGTAA 462 55.23 0.00 −0.753 −0.112 −0.510

685.1 319 TTTATACTAGGTATGGTAAA 463 50.42 0.00 −1.459 −0.112 −0.947

300.0 320 TTATACTAGGTATGGTAAAT 464 50.12 0.00 −1.504 −0.112 −0.975

316.1 321 TATACTAGGTATGGTAAATG 465 49.79 0.00 −1.551 −0.112 −1.004

387.5 322 ATACTAGGTATGGTAAATGC 466 54.30 0.00 −0.889 −0.112 −0.594

685.7 323 TACTAGGTATGGTAAATGCA 467 55.59 0.20 −0.700 0.062 −0.411

759.6 324 ACTAGGTATGGTAAATGCAG 468 56.32 0.80 −0.593 0.583 −0.146

1050.2 325 CTAGGTATGGTAAATGCAGT 469 58.78 1.10 −0.232 0.844 0.177

1020.4 326 TAGGTATGGTAAATGCAGTA 470 56.24 1.10 −0.605 0.844 −0.054

742.6 327 AGGTATGGTAAATGCAGTAT 471 56.81 1.10 −0.521 0.844 −0.002

889.6 328 GGTATGGTAAATGCAGTATA 472 56.07 1.10 −0.631 0.844 −0.070

858.8 329 GTATGGTAAATGCAGTATAC 473 54.02 1.10 −0.931 0.844 −0.256

379.0 330 TATGGTAAATGCAGTATACT 474 53.06 0.40 −1.071 0.236 −0.575

166.7 331 ATGGTAAATGCAGTATACTT 475 53.94 0.40 −0.943 0.236 −0.495

215.3 332 TGGTAAATGCAGTATACTTC 476 55.21 0.40 −0.757 0.236 −0.380

103.2 333 GGTAAATGCAGTATACTTCC 477 59.15 0.40 −0.178 0.236 −0.021

246.3 334 GTAAATGCAGTATACTTCCT 478 58.53 0.80 −0.269 0.583 0.055

163.4 335 TAAATGCAGTATACTTCCTG 479 55.54 0.10 −0.708 −0.025 −0.448

294.1 336 AAATGCAGTATACTTCCTGA 480 57.36 −0.30 −0.441 −0.373 −0.415

531.4 337 AATGCAGTATACTTCCTGAA 481 57.36 −0.30 −0.441 −0.373 −0.415

1995.5 338 ATGCAGTATACTTCCTGAAG 482 59.50 −0.30 −0.128 −0.373 −0.221

510.1 339 TGCAGTATACTTCCTGAAGT 483 62.63 −0.90 0.332 −0.895 −0.134

555.4 340 GCAGTATACTTCCTGAAGTC 484 64.24 −1.10 0.568 −1.069 −0.054

1214.0 341 CAGTATACTTCCTGAAGTCT 485 61.94 −1.10 0.230 −1.069 −0.263

825.7 342 AGTATACTTCCTGAAGTCTT 486 61.00 −1.10 0.094 −1.069 −0.348

1582.6 343 GTATACTTCCTGAAGTCTTC 487 62.28 −1.10 0.281 −1.069 −0.232

2391.8 344 TATACTTCCTGAAGTCTTCA 488 60.34 −1.10 −0.004 −1.069 −0.409

2276.3 345 ATACTTCCTGAAGTCTTCAT 489 60.91 −1.20 0.080 −1.156 −0.389

2702.8 346 TACTTCCTGAAGTCTTCATC 490 62.40 −1.20 0.299 −1.156 −0.254

3781.7 347 ACTTCCTGAAGTCTTCATCT 491 65.05 −1.20 0.686 −1.156 −0.014

5343.4 348 CTTCCTGAAGTCTTCATCTA 492 63.86 −1.20 0.512 −1.156 −0.122

6309.0 349 TTCCTGAAGTCTTCATCTAA 493 59.70 −1.20 −0.098 −1.156 −0.500

6372.4 350 TCCTGAAGTCTTCATCTAAG 494 59.55 −1.20 −0.120 −1.156 −0.513

3835.3 351 CCTGAAGTCTTCATCTAAGG 495 60.76 −1.20 0.057 −1.156 −0.404

8925.5 352 CTGAAGTCTTCATCTAAGGG 496 59.48 −1.20 −0.130 −1.156 −0.520

1211.8 353 TGAAGTCTTCATCTAAGGGA 497 58.84 −1.00 −0.224 −0.982 −0.512

609.4 354 GAAGTCTTCATCTAAGGGAA 498 56.91 −0.10 −0.507 −0.199 −0.390

629.1 355 AAGTCTTCATCTAAGGGAAC 499 56.13 −0.10 −0.622 −0.199 −0.461

749.3 356 AGTCTTCATCTAAGGGAACT 500 60.12 −0.10 −0.036 −0.199 −0.098

805.6 357 GTCTTCATCTAAGGGAACTG 501 59.84 −0.10 −0.077 −0.199 −0.124

817.0 358 TCTTCATCTAAGGGAACTGA 502 58.11 −0.10 −0.331 −0.199 −0.281

327.1 359 CTTCATCTAAGGGAACTGAA 503 54.95 −0.60 −0.794 −0.634 −0.733

320.0 360 TTCATCTAAGGGAACTGAAA 504 51.39 −0.60 −1.316 −0.634 −1.057 −0.822

84.1 361 TCATCTAAGGGAACTGAAAA 505 49.50 0.10 −1.595 −0.025 −0.998 −1.002

67.7 362 CATCTAAGGGAACTGAAAAA 506 46.98 0.10 −1.963 −0.025 −1.227 −1.171

62.2 363 ATCTAAGGGAACTGAAAAAT 507 45.78 0.10 −2.140 −0.025 −1.336 −1.298

78.9 364 TCTAAGGGAACTGAAAAATA 508 45.27 0.10 −2.214 −0.025 −1.382 −1.328

43.2 365 CTAAGGGAACTGAAAAATAT 509 44.36 0.10 −2.349 −0.025 −1.466 −1.322

50.4 366 TAAGGGAACTGAAAAATATG 510 42.71 0.10 −2.591 −0.025 −1.616 −1.242

43.7 367 AAGGGAACTGAAAAATATGC 511 46.54 0.10 −2.028 −0.025 −1.267 −1.163

45.6 368 AGGGAACTGAAAAATATGCA 512 49.21 0.30 −1.637 0.149 −0.958 −1.119

49.8 369 GGGAACTGAAAAATATGCAT 513 49.11 1.20 −1.651 0.931 −0.670 −1.082

53.2 370 GGAACTGAAAAATATGCATC 514 47.87 1.20 −1.834 0.931 −0.783 −0.958

56.6 371 GAACTGAAAAATATGCATCA 515 46.82 0.60 −1.987 0.410 −1.076 −0.844

45.3 372 AACTGAAAAATATGCATCAC 516 46.12 0.40 −2.090 0.236 −1.206 −0.773

56.3 373 ACTGAAAAATATGCATCACC 517 51.18 0.40 −1.347 0.236 −0.746

61.7 374 CTGAAAAATATGCATCACCC 518 54.20 0.40 −0.905 0.236 −0.471

224.5 375 TGAAAAATATGCATCACCCA 519 53.65 0.60 −0.985 0.410 −0.455

413.0 376 GAAAAATATGCATCACCCAC 520 54.14 1.30 −0.913 1.018 −0.179

1584.0 377 AAAAATATGCATCACCCACA 521 54.14 1.30 −0.913 1.018 −0.179

1846.7 378 AAAATATGCATCACCCACAT 522 55.78 1.10 −0.673 0.844 −0.096

2445.8 379 AAATATGCATCACCCACATC 523 58.72 0.90 −0.241 0.670 0.105

3709.4 380 AATATGCATCACCCACATCC 524 64.13 0.90 0.552 0.670 0.597

4548.4 381 ATATGCATCACCCACATCCA 525 67.27 0.90 1.013 0.670 0.883

5254.1 382 TATGCATCACCCACATCCAG 526 67.53 0.90 1.051 0.670 0.906 0.864

5527.2 383 ATGCATCACCCACATCCAGT 527 71.21 0.90 1.590 0.670 1.241 0.991

6916.9 384 TGCATCACCCACATCCAGTA 528 70.68 0.70 1.513 0.496 1.127 1.030

5861.4 385 GCATCACCCACATCCAGTAC 529 71.39 0.70 1.617 0.496 1.191 1.043

8078.4 386 CATCACCCACATCCAGTACT 530 69.16 0.70 1.290 0.496 0.988 1.013

4148.8 387 ATCACCCACATCCAGTACTG 531 67.91 0.70 1.107 0.496 0.875 0.913

3317.1 388 TCACCCACATCCAGTACTGT 532 71.15 0.10 1.582 −0.025 0.971

2486.4 389 CACCCACATCCAGTACTGTT 533 69.94 −0.40 1.404 −0.460 0.696

2746.4 390 ACCCACATCCAGTACTGTTA 534 68.25 −0.40 1.157 −0.460 0.543

2133.0 391 CCCACATCCAGTACTGTTAC 535 68.25 −0.40 1.157 −0.460 0.543

2197.0 392 CCACATCCAGTACTGTTACT 536 66.50 −0.40 0.900 −0.460 0.383

1824.0 393 CACATCCAGTACTGTTACTG 537 62.61 −1.90 0.329 −1.764 −0.467

1675.2 394 ACATCCAGTACTGTTACTGA 538 62.71 −2.30 0.344 −2.112 −0.590

1219.8 395 CATCCAGTACTGTTACTGAT 539 62.12 −2.30 0.258 −2.112 −0.643

1414.0 396 ATCCAGTACTGTTACTGATT 540 61.21 −2.30 0.124 −2.112 −0.726

1710.7 397 TCCAGTACTGTTACTGATTT 541 61.58 −2.30 0.178 −2.112 −0.692

2280.7 398 CCAGTACTGTTACTGATTTT 542 60.48 −2.30 0.017 −2.112 −0.792

2847.7 399 CAGTACTGTTACTGATTTTT 543 56.84 −1.90 −0.518 −1.764 −0.992

2830.2 400 AGTACTGTTACTGATTTTTT 544 55.82 −0.30 −0.666 −0.373 −0.555

4336.3 401 GTACTGTTACTGATTTTTTC 545 57.04 0.40 −0.488 0.236 −0.213

6581.1 402 TACTGTTACTGATTTTTTCT 546 55.95 −0.10 −0.649 −0.199 −0.478

5406.6 403 ACTGTTACTGATTTTTTCTT 547 56.89 −0.10 −0.510 −0.199 −0.392

6083.1 404 CTGTTACTGATTTTTTCTTT 548 56.67 −0.10 −0.542 −0.199 −0.412

6585.7 405 TGTTACTGATTTTTTCTTTT 549 54.96 −0.10 −0.793 −0.199 −0.567

3923.2 406 GTTACTGATTTTTTCTTTTT 550 55.36 −0.10 −0.734 −0.199 −0.531

4093.5 407 TTACTGATTTTTTCTTTTTT 551 52.62 −0.10 −1.136 −0.199 −0.780

1381.5 408 TACTGATTTTTTCTTTTTTA 552 51.70 −0.10 −1.272 −0.199 −0.864 −0.784

1194.3 409 ACTGATTTTTTCTTTTTTAA 553 50.45 −0.10 −1.454 −0.199 −0.977 −0.746

2371.3 410 CTGATTTTTTCTTTTTTAAC 554 50.45 −0.10 −1.454 −0.199 −0.977

395.9 411 TGATTTTTTCTTTTTTAACC 555 52.50 −0.10 −1.155 −0.199 −0.792

230.7 412 GATTTTTTCTTTTTTAACCC 556 56.43 0.30 −0.578 0.149 −0.302

314.9 413 ATTTTTTCTTTTTTAACCCT 557 57.05 0.80 −0.487 0.583 −0.080

276.1 414 TTTTTTCTTTTTTAACCCTG 558 56.99 0.80 −0.495 0.583 −0.085

273.3 415 TTTTTCTTTTTTAACCCTGC 559 60.68 0.80 0.045 0.583 0.250

628.4 416 TTTTCTTTTTTAACCCTGCG 560 60.85 0.80 0.071 0.583 0.265

4661.4 417 TTTCTTTTTTAACCCTGCGG 561 62.93 0.70 0.377 0.496 0.422

411.2 418 TTCTTTTTTAACCCTGCGGG 562 65.01 −0.60 0.681 −0.634 0.181

289.5 419 TCTTTTTTAACCCTGCGGGA 563 65.91 −1.00 0.813 −0.982 0.131

244.8 420 CTTTTTTAACCCTGCGGGAT 564 64.52 −1.00 0.610 −0.982 0.005

250.7 421 TTTTTTAACCCTGCGGGATG 565 62.66 −1.00 0.337 −0.982 −0.164

207.8 422 TTTTTAACCCTGCGGGATGT 566 65.23 −1.00 0.713 −0.982 0.069

255.8 423 TTTTAACCCTGCGGGATGTG 567 64.80 −1.00 0.651 −0.982 0.030

356.8 424 TTTAACCCTGCGGGATGTGG 568 66.83 −1.00 0.949 −0.982 0.215

497.8 425 TTAACCCTGCGGGATGTGGT 569 69.50 −1.00 1.339 −0.982 0.457

754.3 426 TAACCCTGCGGGATGTGGTA 570 68.63 −1.00 1.212 −0.982 0.378

902.4 427 AACCCTGCGGGATGTGGTAT 571 69.14 −1.00 1.286 −0.982 0.424

1186.6 428 ACCCTGCGGGATGTGGTATT 572 71.66 −1.00 1.657 −0.982 0.654

1514.9 429 CCCTGCGGGATGTGGTATTC 573 72.66 −0.60 1.804 −0.634 0.878

2407.6 430 CCTGCGGGATGTGGTATTCC 574 72.66 −0.60 1.804 −0.634 0.878

3019.4 431 CTGCGGGATGTGGTATTCCT 575 71.02 −1.30 1.563 −1.243 0.497

3275.3 432 TGCGGGATGTGGTATTCCTA 576 68.54 −1.30 1.199 −1.243 0.271

2830.8 433 GCGGGATGTGGTATTCCTAA 577 66.48 −1.30 0.896 −1.243 0.083

2620.5 434 CGGGATGTGGTATTCCTAAT 578 62.46 −1.30 0.307 −1.243 −0.282

1827.8 435 GGGATGTGGTATTCCTAATT 579 62.37 −1.30 0.294 −1.243 −0.290

1957.4 436 GGATGTGGTATTCCTAATTG 580 59.71 −0.90 −0.097 −0.895 −0.400

1686.2 437 GATGTGGTATTCCTAATTGA 581 58.45 −0.20 −0.281 −0.286 −0.283

1395.0 438 ATGTGGTATTCCTAATTGAA 582 55.24 −0.20 −0.752 −0.286 −0.575

1245.7 439 TGTGGTATTCCTAATTGAAC 583 55.76 −0.30 −0.675 −0.373 −0.561

1314.0 440 GTGGTATTCCTAATTGAACT 584 57.73 −0.30 −0.387 −0.373 −0.382

1818.7 441 TGGTATTCCTAATTGAACTT 585 55.15 −0.30 −0.765 −0.373 −0.616

880.3 442 GGTATTCCTAATTGAACTTC 586 56.47 −0.30 −0.572 −0.373 −0.496

1419.0 443 GTATTCCTAATTGAACTTCC 587 57.76 −0.30 −0.383 −0.373 −0.379

1567.9 444 TATTCCTAATTGAACTTCCC 588 58.57 −0.30 −0.264 −0.373 −0.306

1959.4 445 ATTCCTAATTGAACTTCCCA 589 60.26 −0.30 −0.016 −0.373 −0.152

2971.8 446 TTCCTAATTGAACTTCCCAG 590 60.45 −0.10 0.013 −0.199 −0.068

1898.5 447 TCCTAATTGAACTTCCCAGA 591 61.36 0.70 0.146 0.496 0.279

1392.3 448 CCTAATTGAACTTCCCAGAA 592 58.27 0.70 −0.308 0.496 −0.002

1143.2 449 CTAATTGAACTTCCCAGAAG 593 54.92 −0.70 −0.800 −0.721 −0.770

427.7 450 TAATTGAACTTCCCAGAAGT 594 55.84 −1.90 −0.664 −1.764 −1.082

148.5 451 AATTGAACTTCCCAGAAGTC 595 57.61 −2.10 −0.404 −1.938 −0.987

259.1 452 ATTGAACTTCCCAGAAGTCT 596 61.42 −2.10 0.154 −1.938 −0.641 −0.751

241.9 453 TTGAACTTCCCAGAAGTCTT 597 61.76 −2.10 0.205 −1.938 −0.609 −0.730

808.1 454 TGAACTTCCCAGAAGTCTTG 598 61.34 −2.10 0.143 −1.938 −0.648

351.6 455 GAACTTCCCAGAAGTCTTGA 599 62.71 −2.10 0.344 −1.938 −0.523

499.7 456 AACTTCCCAGAAGTCTTGAG 600 61.63 −2.10 0.186 −1.938 −0.621

407.4 457 ACTTCCCAGAAGTCTTGAGT 601 66.97 −1.90 0.969 −1.764 −0.069

492.1 458 CTTCCCAGAAGTCTTGAGTT 602 66.75 −1.00 0.937 −0.982 0.208

736.1 459 TTCCCAGAAGTCTTGAGTTC 603 66.31 −0.20 0.872 −0.286 0.432

815.2 460 TCCCAGAAGTCTTGAGTTCT 604 67.98 −1.20 1.116 −1.156 0.253

888.8 461 CCCAGAAGTCTTGAGTTCTC 605 67.98 −1.40 1.116 −1.330 0.187

2021.6 462 CCAGAAGTCTTGAGTTCTCT 606 66.10 −1.40 0.842 −1.330 0.017

1988.5 463 CAGAAGTCTTGAGTTCTCTT 607 62.41 −1.40 0.300 −1.330 −0.319

2008.8 464 AGAAGTCTTGAGTTCTCTTA 608 60.43 −1.20 0.009 −1.156 −0.434

2631.8 465 GAAGTCTTGAGTTCTCTTAT 609 60.20 −0.50 −0.025 −0.547 −0.223

3052.8 466 AAGTCTTGAGTTCTCTTATT 610 59.12 0.30 −0.183 0.149 −0.057

3509.3 467 AGTCTTGAGTTCTCTTATTA 611 60.75 0.30 0.056 0.149 0.091

3221.6 468 GTCTTGAGTTCTCTTATTAA 612 58.29 0.30 −0.305 0.149 −0.132

3677.1 469 TCTTGAGTTCTCTTATTAAG 613 55.25 0.30 −0.751 0.149 −0.409

1176.6 470 CTTGAGTTCTCTTATTAAGT 614 57.04 0.10 −0.488 −0.025 −0.312

1168.1 471 TTGAGTTCTCTTATTAAGTT 615 55.29 0.10 −0.745 −0.025 −0.471

666.3 472 TGAGTTCTCTTATTAAGTTC 616 56.35 0.10 −0.589 −0.025 −0.375

674.0 473 GAGTTCTCTTATTAAGTTCT 617 58.57 0.10 −0.263 −0.025 −0.173

1471.4 474 AGTTCTCTTATTAAGTTCTC 618 58.61 0.10 −0.257 −0.025 −0.169

1493.5 475 GTTCTCTTATTAAGTTCTCT 619 60.59 0.10 0.032 −0.025 0.011

2191.5 476 TTCTCTTATTAAGTTCTCTG 620 57.16 0.10 −0.471 −0.025 −0.301

1410.3 477 TCTCTTATTAAGTTCTCTGA 621 58.23 0.10 −0.314 −0.025 −0.204

1262.8 478 CTCTTATTAAGTTCTCTGAA 622 54.79 0.10 −0.817 −0.025 −0.516

1072.9 479 TCTTATTAAGTTCTCTGAAA 623 50.95 0.10 −1.382 −0.025 −0.866

540.9 480 CTTATTAAGTTCTCTGAAAT 624 49.77 0.50 −1.554 0.323 −0.841

539.2 481 TTATTAAGTTCTCTGAAATC 625 48.99 0.50 −1.668 0.323 −0.912 −0.768

709.0 482 TATTAAGTTCTCTGAAATCT 626 50.64 0.50 −1.427 0.323 −0.762 −0.775

978.1 483 ATTAAGTTCTCTGAAATCTA 627 50.64 0.50 −1.427 0.323 −0.762 −0.732

1217.7 484 TTAAGTTCTCTGAAATCTAC 628 51.15 0.50 −1.352 0.323 −0.716

1748.1 485 TAAGTTCTCTGAAATCTACT 629 52.79 0.50 −1.112 0.323 −0.567

2511.5 486 AAGTTCTCTGAAATCTACTA 630 52.79 0.50 −1.112 0.323 −0.567

2997.2 487 AGTTCTCTGAAATCTACTAA 631 52.79 0.50 −1.112 0.323 −0.567

2887.6 488 GTTCTCTGAAATCTACTAAT 632 52.65 0.50 −1.133 0.323 −0.580

4421.3 489 TTCTCTGAAATCTACTAATT 633 50.14 0.70 −1.500 0.496 −0.741 −0.832

1937.7 490 TCTCTGAAATCTACTAATTT 634 50.14 0.20 −1.500 0.062 −0.906 −0.962

1773.3 491 CTCTGAAATCTACTAATTTT 635 49.31 −0.30 −1.622 −0.373 −1.147 −1.102

1491.1 492 TCTGAAATCTACTAATTTTC 636 48.55 −0.60 −1.734 −0.634 −1.316 −1.171

376.6 493 CTGAAATCTACTAATTTTCT 637 49.31 −1.30 −1.622 −1.243 −1.478 −1.178

371.9 494 TGAAATCTACTAATTTTCTC 638 48.55 −1.30 −1.734 −1.243 −1.547 −1.092

415.2 495 GAAATCTACTAATTTTCTCC 639 52.45 −0.90 −1.161 −0.895 −1.060 −0.938

1097.9 496 AAATCTACTAATTTTCTCCA 640 52.47 −0.10 −1.158 −0.199 −0.794 −0.778

1429.1 497 AATCTACTAATTTTCTCCAT 641 54.25 0.90 −0.897 0.670 −0.301

1812.5 498 ATCTACTAATTTTCTCCATT 642 56.46 1.00 −0.572 0.757 −0.067

1943.4 499 TCTACTAATTTTCTCCATTT 643 56.80 0.50 −0.523 0.323 −0.202

1506.1 500 CTACTAATTTTCTCCATTTA 644 54.93 0.50 −0.797 0.323 −0.372

1694.7 501 TACTAATTTTCTCCATTTAG 645 53.14 0.30 −1.060 0.149 −0.600

946.7 502 ACTAATTTTCTCCATTTAGT 646 56.69 −0.70 −0.539 −0.721 −0.608

1114.3 503 CTAATTTTCTCCATTTAGTA 647 55.57 0.00 −0.704 −0.112 −0.479

963.9 504 TAATTTTCTCCATTTAGTAC 648 54.12 0.50 −0.917 0.323 −0.446

1347.9 505 AATTTTCTCCATTTAGTACT 649 56.69 0.70 −0.539 0.496 −0.145

2067.7 506 ATTTTCTCCATTTAGTACTG 650 58.66 0.80 −0.250 0.583 0.067

2724.2 507 TTTTCTCCATTTAGTACTGT 651 61.92 0.60 0.228 0.410 0.297

3367.9 508 TTTCTCCATTTAGTACTGTC 652 63.10 0.60 0.401 0.410 0.404

5235.8 509 TTCTCCATTTAGTACTGTCT 653 64.84 0.60 0.656 0.410 0.562

6423.5 510 TCTCCATTTAGTACTGTCTT 654 64.84 0.60 0.656 0.410 0.562

7758.9 511 CTCCATTTAGTACTGTCTTT 655 63.63 0.60 0.479 0.410 0.453

8001.5 512 TCCATTTAGTACTGTCTTTT 656 61.92 0.60 0.228 0.410 0.297

5512.4 513 CCATTTAGTACTGTCTTTTT 657 60.78 0.60 0.061 0.410 0.194

5300.0 514 CATTTAGTACTGTCTTTTTT 658 57.04 0.80 −0.489 0.583 −0.081

3902.1 515 ATTTAGTACTGTCTTTTTTC 659 57.08 0.80 −0.482 0.583 −0.077

4641.8 516 TTTAGTACTGTCTTTTTTCT 660 59.26 0.80 −0.162 0.583 0.121

4888.4 517 TTAGTACTGTCTTTTTTCTT 661 59.26 0.80 −0.162 0.583 0.121

5477.3 518 TAGTACTGTCTTTTTTCTTT 662 59.26 0.80 −0.162 0.583 0.121

5064.9 519 AGTACTGTCTTTTTTCTTTA 663 59.26 1.00 −0.162 0.757 0.187

5580.3 520 GTACTGTCTTTTTTCTTTAT 664 59.04 2.70 −0.195 2.236 0.729

5478.3 521 TACTGTCTTTTTTCTTTATG 665 55.71 2.90 −0.683 2.410 0.492

2275.5 522 ACTGTCTTTTTTCTTTATGG 666 59.07 1.70 −0.190 1.366 0.402

1730.8 523 CTGTCTTTTTTCTTTATGGC 667 62.92 1.70 0.374 1.366 0.751

2405.5 524 TGTCTTTTTTCTTTATGGCA 668 62.14 1.70 0.260 1.366 0.680

1942.0 525 GTCTTTTTTCTTTATGGCAA 669 60.05 1.50 −0.047 1.192 0.424

2085.6 526 TCTTTTTTCTTTATGGCAAA 670 54.99 0.60 −0.788 0.410 −0.333

493.2 527 CTTTTTTCTTTATGGCAAAT 671 53.75 0.10 −0.971 −0.025 −0.612

532.7 528 TTTTTTCTTTATGGCAAATA 672 51.30 0.10 −1.331 −0.025 −0.835

280.0 529 TTTTTCTTTATGGCAAATAC 673 51.49 0.10 −1.302 −0.025 −0.817

440.8 530 TTTTCTTTATGGCAAATACT 674 53.08 0.10 −1.069 −0.025 −0.672

463.1 531 TTTCTTTATGGCAAATACTG 675 52.74 0.10 −1.119 −0.025 −0.704

579.0 532 TTCTTTATGGCAAATACTGG 676 54.90 0.10 −0.802 −0.025 −0.507

673.7 533 TCTTTATGGCAAATACTGGA 677 55.85 0.10 −0.663 −0.025 −0.421

837.0 534 CTTTATGGCAAATACTGGAG 678 54.78 0.10 −0.820 −0.025 −0.518

1061.9 535 TTTATGGCAAATACTGGAGT 679 55.74 0.30 −0.679 0.149 −0.365

855.0 536 TTATGGCAAATACTGGAGTA 680 54.87 0.60 −0.806 0.410 −0.344

775.0 537 TATGGCAAATACTGGAGTAT 681 54.56 0.00 −0.852 −0.112 −0.571

773.6 538 ATGGCAAATACTGGAGTATT 682 55.42 −1.00 −0.726 −0.982 −0.823

702.5 539 TGGCAAATACTGGAGTATTG 683 55.37 −1.20 −0.733 −1.156 −0.893 −0.775

387.5 540 GGCAAATACTGGAGTATTGT 684 58.33 −1.20 −0.298 −1.156 −0.624 −0.924

435.3 541 GCAAATACTGGAGTATTGTA 685 55.24 −1.20 −0.753 −1.156 −0.906 −0.974

93.7 542 CAAATACTGGAGTATTGTAT 686 51.30 −1.20 −1.331 −1.156 −1.264 −0.913

50.0 543 AAATACTGGAGTATTGTATG 687 49.96 −1.20 −1.527 −1.156 −1.386 −0.809

50.4 544 AATACTGGAGTATTGTATGG 688 54.30 −1.00 −0.890 −0.982 −0.925

64.7 545 ATACTGGAGTATTGTATGGA 689 57.60 −0.30 −0.406 −0.373 −0.394

76.0 546 TACTGGAGTATTGTATGGAT 690 57.60 0.40 −0.406 0.236 −0.162

86.0 547 ACTGGAGTATTGTATGGATT 691 58.53 1.30 −0.269 1.018 0.220

123.4 548 CTGGAGTATTGTATGGATTC 692 59.39 2.00 −0.144 1.627 0.529

121.5 549 TGGAGTATTGTATGGATTCT 693 59.39 1.80 −0.144 1.453 0.463

641.3 550 GGAGTATTGTATGGATTCTC 694 60.95 0.60 0.086 0.410 0.209

161.5 551 GAGTATTGTATGGATTCTCA 695 59.52 0.60 −0.124 0.410 0.079

129.9 552 AGTATTGTATGGATTCTCAG 696 58.31 1.10 −0.302 0.844 0.134

88.7 553 GTATTGTATGGATTCTCAGG 697 60.87 1.10 0.074 0.844 0.367

112.5 554 TATTGTATGGATTCTCAGGC 698 61.97 1.10 0.236 0.844 0.467

134.6 555 ATTGTATGGATTCTCAGGCC 699 66.52 1.10 0.902 0.844 0.880

191.6 556 TTGTATGGATTCTCAGGCCC 700 70.34 0.70 1.463 0.496 1.096

254.5 557 TGTATGGATTCTCAGGCCCA 701 71.11 0.20 1.577 0.062 1.001

332.2 558 GTATGGATTCTCAGGCCCAA 702 68.95 0.00 1.259 −0.112 0.738

415.6 559 TATGGATTCTCAGGCCCAAT 703 65.78 0.00 0.795 −0.112 0.450

285.0 560 ATGGATTCTCAGGCCCAATT 704 66.68 0.00 0.925 −0.112 0.531

464.0 561 TGGATTCTCAGGCCCAATTT 705 67.04 0.20 0.979 0.062 0.630

492.5 562 GGATTCTCAGGCCCAATTTT 706 67.51 1.10 1.048 0.844 0.970

639.7 563 GATTCTCAGGCCCAATTTTT 707 65.34 1.30 0.729 1.018 0.839

512.4 564 ATTCTCAGGCCCAATTTTTG 708 63.94 0.60 0.524 0.410 0.481

393.4 565 TTCTCAGGCCCAATTTTTGA 709 65.24 0.20 0.716 0.062 0.467

334.3 566 TCTCAGGCCCAATTTTTGAA 710 62.85 0.20 0.364 0.062 0.249

308.2 567 CTCAGGCCCAATTTTTGAAA 711 59.62 0.20 −0.109 0.062 −0.044

199.2 568 TCAGGCCCAATTTTTGAAAT 712 57.85 0.20 −0.369 0.062 −0.205

164.3 569 CAGGCCCAATTTTTGAAATT 713 56.95 −0.50 −0.501 −0.547 −0.518

125.6 570 AGGCCCAATTTTTGAAATTT 714 56.09 −1.00 −0.627 −0.982 −0.762

102.6 571 GGCCCAATTTTTGAAATTTT 715 56.23 −1.00 −0.606 −0.982 −0.749

91.6 572 GCCCAATTTTTGAAATTTTC 716 55.07 −1.00 −0.777 −0.982 −0.855 −0.806

76.2 573 CCCAATTTTTGAAATTTTCC 717 54.96 −1.00 −0.792 −0.982 −0.864 −0.881

78.8 574 CCAATTTTTGAAATTTTCCC 718 54.96 −1.00 −0.792 −0.982 −0.864 −0.841

84.8 575 CAATTTTTGAAATTTTCCCT 719 53.17 −1.00 −1.055 −0.982 −1.027 −0.755

162.0 576 AATTTTTGAAATTTTCCCTT 720 52.25 −0.80 −1.190 −0.808 −1.045

539.5 577 ATTTTTGAAATTTTCCCTTC 721 55.17 0.10 −0.762 −0.025 −0.482

1787.3 578 TTTTTGAAATTTTCCCTTCC 722 58.88 0.10 −0.219 −0.025 −0.145

6354.2 579 TTTTGAAATTTTCCCTTCCT 723 60.39 0.10 0.004 −0.025 −0.007

9513.6 580 TTTGAAATTTTCCCTTCCTT 724 60.39 0.10 0.004 −0.025 −0.007

10660.0 581 TTGAAATTTTCCCTTCCTTT 725 60.39 0.10 0.004 −0.025 −0.007

11202.0 582 TGAAATTTTCCCTTCCTTTT 726 60.39 0.10 0.004 −0.025 −0.007

11543.0 583 GAAATTTTCCCTTCCTTTTC 727 61.81 0.40 0.212 0.236 0.221

14774.0 584 AAATTTTCCCTTCCTTTTCC 728 64.17 1.20 0.557 0.931 0.699 0.952

18197.0 585 AATTTTCCCTTCCTTTTCCA 729 67.39 1.70 1.030 1.366 1.158 1.307

21410.0 586 ATTTTCCCTTCCTTTTCCAT 730 69.58 4.00 1.351 3.366 2.117 1.679

22869.0 587 TTTTCCCTTCCTTTTCCATT 731 69.96 5.00 1.408 4.236 2.482 2.039

21818.0 588 TTTCCCTTCCTTTTCCATTT 732 69.96 5.00 1.408 4.236 2.482 2.113

21341.0 589 TTCCCTTCCTTTTCCATTTC 733 71.19 5.00 1.588 4.236 2.594 2.085

22063.0 590 TCCCTTCCTTTTCCATTTCT 734 72.77 5.00 1.820 4.236 2.738 1.863

22152.0 591 CCCTTCCTTTTCCATTTCTG 735 71.01 0.90 1.561 0.670 1.223 1.571

20764.0 592 CCTTCCTTTTCCATTTCTGT 736 70.68 0.20 1.513 0.062 0.961 1.289

12579.0 593 CTTCCTTTTCCATTTCTGTA 737 66.30 0.20 0.870 0.062 0.563 0.945

9036.3 594 TTCCTTTTCCATTTCTGTAC 738 64.87 0.20 0.660 0.062 0.433

8251.8 595 TCCTTTTCCATTTCTGTACA 739 65.74 0.20 0.788 0.062 0.512

20788.0 596 CCTTTTCCATTTCTGTACAA 740 62.11 0.20 0.256 0.062 0.182

7073.9 597 CTTTTCCATTTCTGTACAAA 741 56.39 0.20 −0.583 0.062 −0.338

2932.4 598 TTTTCCATTTCTGTACAAAT 742 54.49 0.20 −0.862 0.062 −0.511

1897.3 599 TTTCCATTTCTGTACAAATT 743 54.49 −0.30 −0.862 −0.373 −0.676

2158.1 600 TTCCATTTCTGTACAAATTT 744 54.49 −0.30 −0.862 −0.373 −0.676

2215.9 601 TCCATTTCTGTACAAATTTC 745 55.43 −0.30 −0.724 −0.373 −0.591

2168.6 602 CCATTTCTGTACAAATTTCT 746 56.07 −0.30 −0.631 −0.373 −0.533

2025.8 603 CATTTCTGTACAAATTTCTA 747 51.65 −0.30 −1.278 −0.373 −0.934

1277.2 604 ATTTCTGTACAAATTTCTAC 748 50.83 −0.10 −1.398 −0.199 −0.943 −0.736

1944.8 605 TTTCTGTACAAATTTCTACT 749 52.78 0.40 −1.112 0.236 −0.600 −0.790

2504.3 606 TTCTGTACAAATTTCTACTA 750 51.90 0.40 −1.242 0.236 −0.681 −0.876

2941.5 607 TCTGTACAAATTTCTACTAA 751 49.84 0.40 −1.544 0.236 −0.868 −0.846

2694.8 608 CTGTACAAATTTCTACTAAT 752 48.73 0.40 −1.707 0.236 −0.969 −0.827

2610.7 609 TGTACAAATTTCTACTAATG 753 46.88 0.40 −1.979 0.236 −1.137 −0.845

1678.1 610 GTACAAATTTCTACTAATGC 754 50.66 0.60 −1.424 0.410 −0.727 −0.854

5877.3 611 TACAAATTTCTACTAATGCT 755 49.82 0.60 −1.547 0.410 −0.803 −0.849

4461.0 612 ACAAATTTCTACTAATGCTT 756 50.65 0.60 −1.425 0.410 −0.728 −0.816

5943.2 613 CAAATTTCTACTAATGCTTT 757 50.46 0.60 −1.453 0.410 −0.745 −0.753

6492.9 614 AAATTTCTACTAATGCTTTT 758 49.47 0.60 −1.599 0.410 −0.836 −0.745

6875.0 615 AATTTCTACTAATGCTTTTA 759 50.61 0.60 −1.431 0.410 −0.731

7950.3 616 ATTTCTACTAATGCTTTTAT 760 52.40 0.20 −1.169 0.062 −0.701

8314.8 617 TTTCTACTAATGCTTTTATT 761 52.72 0.20 −1.122 0.062 −0.672

6885.8 618 TTCTACTAATGCTTTTATTT 762 52.72 0.20 −1.122 0.062 −0.672

6443.2 619 TCTACTAATGCTTTTATTTT 763 52.72 0.20 −1.122 0.062 −0.672 −0.731

6331.0 620 CTACTAATGCTTTTATTTTT 764 51.81 0.20 −1.255 0.062 −0.755

5952.5 621 TACTAATGCTTTTATTTTTT 765 50.18 0.20 −1.494 0.062 −0.903

2662.8 622 ACTAATGCTTTTATTTTTTC 766 51.96 0.20 −1.233 0.062 −0.741

3034.0 623 CTAATGCTTTTATTTTTTCT 767 53.41 0.20 −1.021 0.062 −0.609

2198.5 624 TAATGCTTTTATTTTTTCTT 768 51.76 0.40 −1.263 0.236 −0.694

1670.1 625 AATGCTTTTATTTTTTCTTC 769 53.61 1.10 −0.992 0.844 −0.294

3039.4 626 ATGCTTTTATTTTTTCTTCT 770 57.66 2.10 −0.397 1.714 0.405

3873.8 627 TGCTTTTATTTTTTCTTCTG 771 57.60 2.80 −0.406 2.323 0.631

3609.7 628 GCTTTTATTTTTTCTTCTGT 772 60.96 3.10 0.087 2.583 1.036

4891.4 629 CTTTTATTTTTTCTTCTGTC 773 57.96 3.10 −0.353 2.583 0.763

3071.6 630 TTTTATTTTTTCTTCTGTCA 774 57.22 3.10 −0.461 2.583 0.696

2667.2 631 TTTATTTTTTCTTCTGTCAA 775 54.81 1.70 −0.816 1.366 0.013

2293.1 632 TTATTTTTTCTTCTGTCAAT 776 54.46 1.20 −0.866 0.931 −0.183

2123.0 633 TATTTTTTCTTCTGTCAATG 777 54.08 1.20 −0.922 0.931 −0.218

1914.7 634 ATTTTTTCTTCTGTCAATGG 778 57.36 1.20 −0.442 0.931 0.080

2174.1 635 TTTTTTCTTCTGTCAATGGC 779 61.67 1.20 0.192 0.931 0.473

3659.7 636 TTTTTCTTCTGTCAATGGCC 780 65.26 1.20 0.717 0.931 0.799

5217.7 637 TTTTCTTCTGTCAATGGCCA 781 66.11 1.20 0.843 0.931 0.877

4559.7 638 TTTCTTCTGTCAATGGCCAT 782 65.73 1.00 0.787 0.757 0.776

4347.7 639 TTCTTCTGTCAATGGCCATT 783 65.73 1.00 0.787 0.757 0.776

5267.4 640 TCTTCTGTCAATGGCCATTG 784 65.26 −0.60 0.718 −0.634 0.204

3922.8 641 CTTCTGTCAATGGCCATTGT 785 66.97 −1.30 0.968 −1.243 0.128

3608.6 642 TTCTGTCAATGGCCATTGTT 786 65.36 −1.30 0.733 −1.243 −0.018

1881.6 643 TCTGTCAATGGCCATTGTTT 787 65.36 −1.30 0.733 −1.243 −0.018

1658.0 644 CTGTCAATGGCCATTGTTTA 788 63.32 −1.30 0.433 −1.243 −0.204

1369.8 645 TGTCAATGGCCATTGTTTAA 789 59.38 −1.30 −0.144 −1.243 −0.562

605.8 646 GTCAATGGCCATTGTTTAAC 790 59.99 −1.30 −0.055 −1.243 −0.506

933.2 647 TCAATGGCCATTGTTTAACT 791 58.93 −1.30 −0.211 −1.243 −0.603

441.8 648 CAATGGCCATTGTTTAACTT 792 57.97 −0.90 −0.352 −0.895 −0.558

545.6 649 AATGGCCATTGTTTAACTTT 793 57.07 0.90 −0.483 0.670 −0.045

781.4 650 ATGGCCATTGTTTAACTTTT 794 59.31 0.90 −0.156 0.670 0.158

1027.3 651 TGGCCATTGTTTAACTTTTG 795 59.24 0.90 −0.165 0.670 0.152

1102.5 652 GGCCATTGTTTAACTTTTGG 796 61.84 0.30 0.216 0.149 0.190

935.7 653 GCCATTGTTTAACTTTTGGG 797 61.84 −0.10 0.216 −0.199 0.058

403.7 654 CCATTGTTTAACTTTTGGGC 798 61.84 0.30 0.216 0.149 0.190

269.3 655 CATTGTTTAACTTTTGGGCC 799 61.84 0.90 0.216 0.670 0.389

296.8 656 ATTGTTTAACTTTTGGGCCA 800 61.84 0.90 0.216 0.670 0.389

449.4 657 TTGTTTAACTTTTGGGCCAT 801 61.84 0.90 0.216 0.670 0.389

448.1 658 TGTTTAACTTTTGGGCCATC 802 62.91 0.90 0.373 0.670 0.486

584.9 659 GTTTAACTTTTGGGCCATCC 803 66.73 0.40 0.934 0.236 0.669

1032.4 660 TTTAACTTTTGGGCCATCCA 804 64.79 −0.70 0.649 −0.721 0.128

737.8 661 TTAACTTTTGGGCCATCCAT 805 64.44 −1.20 0.598 −1.156 −0.069

950.2 662 TAACTTTTGGGCCATCCATT 806 64.44 −1.20 0.598 −1.156 −0.069

1308.0 663 AACTTTTGGGCCATCCATTC 807 66.42 −1.20 0.888 −1.156 0.111

2360.1 664 ACTTTTGGGCCATCCATTCC 808 72.21 −1.20 1.738 −1.156 0.638

4946.0 665 CTTTTGGGCCATCCATTCCT 809 73.53 −1.20 1.930 −1.156 0.758

6789.2 666 TTTTGGGCCATCCATTCCTG 810 71.49 −1.20 1.632 −1.156 0.573

8150.6 667 TTTGGGCCATCCATTCCTGG 811 73.62 −1.20 1.945 −1.156 0.766

7589.0 668 TTGGGCCATCCATTCCTGGC 812 77.43 −2.80 2.504 −2.547 0.584

13914.0 669 TGGGCCATCCATTCCTGGCT 813 78.94 −3.50 2.725 −3.156 0.490

17513.0 670 GGGCCATCCATTCCTGGCTT 814 79.51 −3.50 2.809 −3.156 0.542

19883.0 671 GGCCATCCATTCCTGGCTTT 815 77.37 −3.50 2.494 −3.156 0.347

20103.0 672 GCCATCCATTCCTGGCTTTA 816 74.28 −3.10 2.040 −2.808 0.198

18622.0 673 CCATCCATTCCTGGCTTTAA 817 67.92 −1.30 1.109 −1.243 0.215

16915.0 674 CATCCATTCCTGGCTTTAAT 818 64.36 −1.30 0.585 −1.243 −0.109

13910.0 675 ATCCATTCCTGGCTTTAATT 819 63.53 −1.30 0.464 −1.243 −0.185

12524.0 676 TCCATTCCTGGCTTTAATTT 820 63.88 −1.30 0.516 −1.243 −0.152

11890.0 677 CCATTCCTGGCTTTAATTTT 821 62.81 −0.90 0.359 −0.895 −0.118

12839.0 678 CATTCCTGGCTTTAATTTTA 822 58.55 0.90 −0.266 0.670 0.090

9726.8 679 ATTCCTGGCTTTAATTTTAC 823 57.84 1.50 −0.371 1.192 0.223

8499.7 680 TTCCTGGCTTTAATTTTACT 824 59.78 1.90 −0.086 1.540 0.532

6800.4 681 TCCTGGCTTTAATTTTACTG 825 59.37 1.90 −0.146 1.540 0.494

5445.6 682 CCTGGCTTTAATTTTACTGG 826 60.53 1.90 0.024 1.540 0.600

2901.6 683 CTGGCTTTAATTTTACTGGT 827 59.77 1.90 −0.087 1.540 0.531

1174.2 684 TGGCTTTAATTTTACTGGTA 828 57.25 1.90 −0.458 1.540 0.301

521.3 685 GGCTTTAATTTTACTGGTAC 829 57.86 1.90 −0.368 1.540 0.357

611.1 686 GCTTTAATTTTACTGGTACA 830 56.55 1.80 −0.560 1.453 0.205

287.6 687 CTTTAATTTTACTGGTACAG 831 52.66 0.40 −1.130 0.236 −0.611

109.5 688 TTTAATTTTACTGGTACAGT 832 53.62 −0.80 −0.989 −0.808 −0.920

59.5 689 TTAATTTTACTGGTACAGTC 833 54.59 −1.00 −0.847 −0.982 −0.898

62.1 690 TAATTTTACTGGTACAGTCT 834 56.28 −1.00 −0.599 −0.982 −0.745

59.4 691 AATTTTACTGGTACAGTCTC 835 58.27 −1.00 −0.308 −0.982 −0.564

68.0 692 ATTTTACTGGTACAGTCTCA 836 61.78 −1.00 0.207 −0.982 −0.245

72.9 693 TTTTACTGGTACAGTCTCAA 837 59.61 −1.00 −0.111 −0.982 −0.442

62.2 694 TTTACTGGTACAGTCTCAAT 838 59.25 −1.00 −0.164 −0.982 −0.475

64.5 695 TTACTGGTACAGTCTCAATA 839 58.30 −1.00 −0.303 −0.982 −0.561

53.5 696 TACTGGTACAGTCTCAATAG 840 58.15 −1.00 −0.326 −0.982 −0.575

57.8 697 ACTGGTACAGTCTCAATAGG 841 61.44 −0.80 0.157 −0.808 −0.210

341.0 698 CTGGTACAGTCTCAATAGGG 842 63.55 0.10 0.467 −0.025 0.280

54.8 699 TGGTACAGTCTCAATAGGGC 843 65.89 1.10 0.810 0.844 0.823

47.1 700 GGTACAGTCTCAATAGGGCT 844 68.08 0.90 1.131 0.670 0.956

59.7 701 GTACAGTCTCAATAGGGCTA 845 64.73 0.70 0.640 0.496 0.586

47.0 702 TACAGTCTCAATAGGGCTAA 846 59.35 0.70 −0.149 0.496 0.096

49.3 703 ACAGTCTCAATAGGGCTAAT 847 59.91 0.70 −0.067 0.496 0.147

55.0 704 CAGTCTCAATAGGGCTAATG 848 59.29 0.70 −0.158 0.496 0.091

49.0 705 AGTCTCAATAGGGCTAATGG 849 60.62 0.90 0.037 0.670 0.278

45.7 706 GTCTCAATAGGGCTAATGGG 850 63.00 1.10 0.386 0.844 0.560

115.6 707 TCTCAATAGGGCTAATGGGA 851 61.22 0.40 0.125 0.236 0.167

50.6 708 CTCAATAGGGCTAATGGGAA 852 57.97 1.40 −0.352 1.105 0.202

48.0 709 TCAATAGGGCTAATGGGAAA 853 54.39 1.40 −0.877 1.105 −0.124

50.5 710 CAATAGGGCTAATGGGAAAA 854 51.64 1.80 −1.281 1.453 −0.242

44.1 711 AATAGGGCTAATGGGAAAAT 855 50.45 1.90 −1.454 1.540 −0.316

43.1 712 ATAGGGCTAATGGGAAAATT 856 52.34 1.00 −1.178 0.757 −0.442

45.2 713 TAGGGCTAATGGGAAAATTT 857 52.63 0.50 −1.135 0.323 −0.581

47.4 714 AGGGCTAATGGGAAAATTTA 858 52.63 0.50 −1.135 0.323 −0.581

50.0 715 GGGCTAATGGGAAAATTTAA 859 50.89 0.50 −1.390 0.323 −0.739 −0.867

47.8 716 GGCTAATGGGAAAATTTAAA 860 47.14 0.50 −1.940 0.323 −1.080 −1.022

50.2 717 GCTAATGGGAAAATTTAAAG 861 45.00 0.50 −2.254 0.323 −1.275 −1.096

43.0 718 CTAATGGGAAAATTTAAAGT 862 43.95 0.50 −2.408 0.323 −1.371 −1.088

57.0 719 TAATGGGAAAATTTAAAGTG 863 42.27 0.50 −2.655 0.323 −1.524 −1.072

58.7 720 AATGGGAAAATTTAAAGTGC 864 46.18 0.70 −2.081 0.496 −1.102 −1.011

183.6 721 ATGGGAAAATTTAAAGTGCA 865 48.90 1.70 −1.682 1.366 −0.524 −0.924

303.4 722 TGGGAAAATTTAAAGTGCAA 866 47.39 1.80 −1.903 1.453 −0.628 −0.837

135.7 723 GGGAAAATTTAAAGTGCAAC 867 47.84 1.60 −1.838 1.279 −0.653 −0.766

241.7 724 GGAAAATTTAAAGTGCAACC 868 49.12 1.20 −1.649 0.931 −0.669 −0.737

132.5 725 GAAAATTTAAAGTGCAACCA 869 48.09 1.20 −1.801 0.931 −0.763 −0.758

128.8 726 AAAATTTAAAGTGCAACCAA 870 45.57 1.10 −2.171 0.844 −1.025

141.0 727 AAATTTAAAGTGCAACCAAT 871 46.97 1.10 −1.965 0.844 −0.897

282.0 728 AATTTAAAGTGCAACCAATC 872 49.46 1.10 −1.599 0.844 −0.671

948.6 729 ATTTAAAGTGCAACCAATCT 873 52.84 1.10 −1.104 0.844 −0.363

1815.1 730 TTTAAAGTGCAACCAATCTG 874 52.81 1.10 −1.109 0.844 −0.366

3188.2 731 TTAAAGTGCAACCAATCTGA 875 53.71 1.00 −0.976 0.757 −0.317

3566.1 732 TAAAGTGCAACCAATCTGAG 876 53.56 1.00 −0.999 0.757 −0.331

2925.1 733 AAAGTGCAACCAATCTGAGT 877 56.81 1.00 −0.522 0.757 −0.036

3233.2 734 AAGTGCAACCAATCTGAGTC 878 59.99 1.00 −0.055 0.757 0.254

3615.6 735 AGTGCAACCAATCTGAGTCA 879 63.25 1.00 0.422 0.757 0.550

3994.8 736 GTGCAACCAATCTGAGTCAA 880 61.00 1.00 0.093 0.757 0.345

4033.0 737 TGCAACCAATCTGAGTCAAC 881 58.62 1.00 −0.257 0.757 0.128

3380.2 738 GCAACCAATCTGAGTCAACA 882 59.87 1.00 −0.073 0.757 0.242

4288.7 739 CAACCAATCTGAGTCAACAG 883 56.22 −0.30 −0.608 −0.373 −0.519

744.1 740 AACCAATCTGAGTCAACAGA 884 56.24 −1.60 −0.605 −1.504 −0.946 −0.757

392.2 741 ACCAATCTGAGTCAACAGAT 885 58.10 −2.30 −0.332 −2.112 −1.009 −1.030

158.1 742 CCAATCTGAGTCAACAGATT 886 57.90 −3.30 −0.362 −2.982 −1.357 −1.219

70.8 743 CAATCTGAGTCAACAGATTT 887 54.41 −3.80 −0.874 −3.417 −1.840 −1.262

190.0 744 AATCTGAGTCAACAGATTTC 888 54.37 −3.60 −0.880 −3.243 −1.778 −1.168

87.7 745 ATCTGAGTCAACAGATTTCT 889 58.37 −2.60 −0.293 −2.373 −1.084 −1.017

152.7 746 TCTGAGTCAACAGATTTCTT 890 58.73 −1.90 −0.241 −1.764 −0.820 −0.797

270.5 747 CTGAGTCAACAGATTTCTTC 891 58.73 −0.30 −0.241 −0.373 −0.291

498.7 748 TGAGTCAACAGATTTCTTCC 892 60.70 0.20 0.049 0.062 0.054

891.0 749 GAGTCAACAGATTTCTTCCA 893 62.06 0.20 0.248 0.062 0.177

1509.8 750 AGTCAACAGATTTCTTCCAA 894 58.66 0.20 −0.250 0.062 −0.132

1009.3 751 GTCAACAGATTTCTTCCAAT 895 58.47 0.20 −0.279 0.062 −0.149

1198.0 752 TCAACAGATTTCTTCCAATT 896 55.86 0.20 −0.661 0.062 −0.387

680.5 753 CAACAGATTTCTTCCAATTA 897 54.08 0.20 −0.922 0.062 −0.548

762.5 754 AACAGATTTCTTCCAATTAT 898 52.82 0.20 −1.107 0.062 −0.663

689.8 755 ACAGATTTCTTCCAATTATG 899 54.58 0.20 −0.849 0.062 −0.503

715.1 756 CAGATTTCTTCCAATTATGT 900 56.99 0.20 −0.496 0.062 −0.284

833.8 757 AGATTTCTTCCAATTATGTT 901 56.02 0.20 −0.638 0.062 −0.372

1067.7 758 GATTTCTTCCAATTATGTTG 902 55.80 0.30 −0.670 0.149 −0.359

1225.9 759 ATTTCTTCCAATTATGTTGA 903 55.80 −0.10 −0.670 −0.199 −0.491

1028.7 760 TTTCTTCCAATTATGTTGAC 904 56.34 −0.10 −0.591 −0.199 −0.442

1419.0 761 TTCTTCCAATTATGTTGACA 905 57.29 −0.10 −0.452 −0.199 −0.356

1437.4 762 TCTTCCAATTATGTTGACAG 906 57.14 −0.10 −0.474 −0.199 −0.369

1518.3 763 CTTCCAATTATGTTGACAGG 907 58.36 −0.10 −0.295 −0.199 −0.259

1560.3 764 TTCCAATTATGTTGACAGGT 908 59.43 −0.10 −0.138 −0.199 −0.161

1100.0 765 TCCAATTATGTTGACAGGTG 909 59.02 −0.10 −0.198 −0.199 −0.198

1096.4 766 CCAATTATGTTGACAGGTGT 910 60.68 −0.10 0.046 −0.199 −0.047

1103.4 767 CAATTATGTTGACAGGTGTA 911 56.24 0.30 −0.605 0.149 −0.319

738.1 768 AATTATGTTGACAGGTGTAG 912 55.09 1.10 −0.774 0.844 −0.159

596.7 769 ATTATGTTGACAGGTGTAGG 913 59.83 1.10 −0.079 0.844 0.272

548.1 770 TTATGTTGACAGGTGTAGGT 914 63.16 1.10 0.409 0.844 0.575

701.1 771 TATGTTGACAGGTGTAGGTC 915 64.38 −0.20 0.588 −0.286 0.256

724.7 772 ATGTTGACAGGTGTAGGTCC 916 69.08 −0.60 1.278 −0.634 0.551

1129.8 773 TGTTGACAGGTGTAGGTCCT 917 71.21 −0.60 1.591 −0.634 0.745

1214.0 774 GTTGACAGGTGTAGGTCCTA 918 70.75 −0.60 1.523 −0.634 0.703

1425.4 775 TTGACAGGTGTAGGTCCTAC 919 67.83 −0.60 1.095 −0.634 0.438

838.8 776 TGACAGGTGTAGGTCCTACT 920 69.52 −0.90 1.343 −0.895 0.493

1173.1 777 GACAGGTGTAGGTCCTACTA 921 69.06 −0.90 1.275 −0.895 0.450

1367.0 778 ACAGGTGTAGGTCCTACTAA 922 65.30 −0.90 0.723 −0.895 0.108

872.0 779 CAGGTGTAGGTCCTACTAAT 923 64.69 −0.90 0.634 −0.895 0.053

897.6 780 AGGTGTAGGTCCTACTAATA 924 62.84 −0.90 0.362 −0.895 −0.115

962.2 781 GGTGTAGGTCCTACTAATAC 925 63.19 −0.90 0.414 −0.895 −0.083

1382.6 782 GTGTAGGTCCTACTAATACT 926 62.53 −0.90 0.317 −0.895 −0.143

1132.9 783 TGTAGGTCCTACTAATACTG 927 59.27 −0.90 −0.160 −0.895 −0.439

1180.7 784 GTAGGTCCTACTAATACTGT 928 62.53 −0.50 0.317 −0.547 −0.011

1932.9 785 TAGGTCCTACTAATACTGTA 929 58.77 0.70 −0.234 0.496 0.043

1634.4 786 AGGTCCTACTAATACTGTAC 930 59.91 0.50 −0.067 0.323 0.081

2488.1 787 GGTCCTACTAATACTGTACC 931 63.54 0.50 0.466 0.323 0.411

3560.9 788 GTCCTACTAATACTGTACCT 932 62.91 0.50 0.373 0.323 0.354

3850.1 789 TCCTACTAATACTGTACCTA 933 59.31 0.50 −0.155 0.323 0.026

1879.0 790 CCTACTAATACTGTACCTAT 934 57.99 0.50 −0.348 0.323 −0.093

1920.4 791 CTACTAATACTGTACCTATA 935 53.68 0.50 −0.981 0.323 −0.486

1131.2 792 TACTAATACTGTACCTATAG 936 51.92 0.70 −1.240 0.496 −0.580

756.5 793 ACTAATACTGTACCTATAGC 937 56.45 1.20 −0.574 0.931 −0.002

1881.3 794 CTAATACTGTACCTATAGCT 938 57.85 1.20 −0.369 0.931 0.125

2033.6 795 TAATACTGTACCTATAGCTT 939 56.25 1.20 −0.604 0.931 −0.021

1853.9 796 AATACTGTACCTATAGCTTT 940 57.14 1.20 −0.473 0.931 0.060

2462.6 797 ATACTGTACCTATAGCTTTA 941 58.55 1.20 −0.266 0.931 0.189

2436.8 798 TACTGTACCTATAGCTTTAT 942 58.55 1.20 −0.266 0.931 0.189

1865.2 799 ACTGTACCTATAGCTTTATG 943 59.06 1.20 −0.192 0.931 0.235

1682.1 800 CTGTACCTATAGCTTTATGT 944 61.64 1.30 0.187 1.018 0.503

1551.3 801 TGTACCTATAGCTTTATGTC 945 61.08 1.10 0.105 0.844 0.386

1600.1 802 GTACCTATAGCTTTATGTCC 946 65.16 1.10 0.703 0.844 0.757

4094.6 803 TACCTATAGCTTTATGTCCA 947 63.16 1.10 0.409 0.844 0.575

2794.2 804 ACCTATAGCTTTATGTCCAC 948 64.30 1.30 0.577 1.018 0.745

4754.9 805 CCTATAGCTTTATGTCCACA 949 64.94 1.30 0.671 1.018 0.803

4185.4 806 CTATAGCTTTATGTCCACAG 950 61.34 1.10 0.143 0.844 0.409

3284.3 807 TATAGCTTTATGTCCACAGA 951 60.70 1.10 0.048 0.844 0.351

2819.7 808 ATAGCTTTATGTCCACAGAT 952 61.27 0.60 0.132 0.410 0.238

3545.1 809 TAGCTTTATGTCCACAGATT 953 61.63 0.60 0.186 0.410 0.271

4232.6 810 AGCTTTATGTCCACAGATTT 954 62.57 0.60 0.324 0.410 0.356

5252.8 811 GCTTTATGTCCACAGATTTC 955 63.85 0.60 0.511 0.410 0.472

6823.9 812 CTTTATGTCCACAGATTTCT 956 61.56 0.60 0.176 0.410 0.265

4829.8 813 TTTATGTCCACAGATTTCTA 957 58.97 0.60 −0.205 0.410 0.029

4333.7 814 TTATGTCCACAGATTTCTAT 958 58.62 0.60 −0.257 0.410 −0.004

3801.0 815 TATGTCCACAGATTTCTATG 959 58.20 0.60 −0.318 0.410 −0.041

3528.2 816 ATGTCCACAGATTTCTATGA 960 60.12 0.60 −0.036 0.410 0.134

2080.0 817 TGTCCACAGATTTCTATGAG 961 60.34 0.60 −0.004 0.410 0.153

913.8 818 GTCCACAGATTTCTATGAGT 962 63.68 0.60 0.486 0.410 0.457

1228.3 819 TCCACAGATTTCTATGAGTA 963 59.83 0.80 −0.078 0.583 0.173

238.1 820 CCACAGATTTCTATGAGTAT 964 58.43 1.10 −0.285 0.844 0.144

219.4 821 CACAGATTTCTATGAGTATC 965 55.78 0.90 −0.673 0.670 −0.162

138.6 822 ACAGATTTCTATGAGTATCT 966 56.48 −0.10 −0.571 −0.199 −0.430

112.7 823 CAGATTTCTATGAGTATCTG 967 55.85 −1.30 −0.663 −1.243 −0.883

133.8 824 AGATTTCTATGAGTATCTGA 968 55.87 −0.10 −0.659 −0.199 −0.485

296.8 825 GATTTCTATGAGTATCTGAT 969 55.69 0.60 −0.686 0.410 −0.270

279.7 826 ATTTCTATGAGTATCTGATC 970 55.67 0.80 −0.689 0.583 −0.206

484.4 827 TTTCTATGAGTATCTGATCA 971 57.06 0.20 −0.485 0.062 −0.277

502.0 828 TTCTATGAGTATCTGATCAT 972 56.70 −0.50 −0.538 −0.547 −0.541

637.3 829 TCTATGAGTATCTGATCATA 973 55.75 −1.10 −0.678 −1.069 −0.826

489.0 830 CTATGAGTATCTGATCATAC 974 54.95 −1.30 −0.794 −1.243 −0.965

808.7 831 TATGAGTATCTGATCATACT 975 54.95 −1.10 −0.794 −1.069 −0.899 −0.738

903.2 832 ATGAGTATCTGATCATACTG 976 55.49 −1.20 −0.715 −1.156 −0.883

1709.3 833 TGAGTATCTGATCATACTGT 977 58.64 −1.20 −0.254 −1.156 −0.597

2103.9 834 GAGTATCTGATCATACTGTC 978 60.20 −1.20 −0.025 −1.156 −0.455

3973.4 835 AGTATCTGATCATACTGTCT 979 60.88 −1.00 0.076 −0.982 −0.326

6462.3 836 GTATCTGATCATACTGTCTT 980 61.03 −0.30 0.097 −0.373 −0.081

9749.0 837 TATCTGATCATACTGTCTTA 981 57.16 0.90 −0.470 0.670 −0.037

7817.2 838 ATCTGATCATACTGTCTTAC 982 58.34 0.90 −0.298 0.670 0.070

9683.1 839 TCTGATCATACTGTCTTACT 983 60.42 0.90 0.008 0.670 0.259

8089.0 840 CTGATCATACTGTCTTACTT 984 59.32 0.90 −0.154 0.670 0.159

8696.8 841 TGATCATACTGTCTTACTTT 985 57.63 0.90 −0.401 0.670 0.006

6880.5 842 GATCATACTGTCTTACTTTG 986 57.63 0.90 −0.401 0.670 0.006

7033.7 843 ATCATACTGTCTTACTTTGA 987 57.63 0.90 −0.401 0.670 0.006

5406.5 844 TCATACTGTCTTACTTTGAT 988 57.63 0.70 −0.401 0.496 −0.060

4239.4 845 CATACTGTCTTACTTTGATA 989 55.68 0.70 −0.688 0.496 −0.238

3727.4 846 ATACTGTCTTACTTTGATAA 990 52.44 0.70 −1.163 0.496 −0.533

2665.5 847 TACTGTCTTACTTTGATAAA 991 50.65 0.70 −1.426 0.496 −0.696

1817.8 848 ACTGTCTTACTTTGATAAAA 992 49.49 −0.30 −1.595 −0.373 −1.131 −0.809

1335.9 849 CTGTCTTACTTTGATAAAAC 993 49.49 −0.50 −1.595 −0.547 −1.197 −0.916

1526.2 850 TGTCTTACTTTGATAAAACC 994 51.45 −0.50 −1.309 −0.547 −1.019 −0.949

822.7 851 GTCTTACTTTGATAAAACCT 995 53.32 −0.50 −1.034 −0.547 −0.849 −0.966

1227.4 852 TCTTACTTTGATAAAACCTC 996 51.75 −0.50 −1.264 −0.547 −0.991 −0.946

503.0 853 CTTACTTTGATAAAACCTCC 997 54.28 −0.50 −0.894 −0.547 −0.762 −0.910

1174.3 854 TTACTTTGATAAAACCTCCA 998 53.70 −0.50 −0.978 −0.547 −0.814 −0.901

885.5 855 TACTTTGATAAAACCTCCAA 999 51.79 −0.50 −1.259 −0.547 −0.988 −0.916

650.6 856 ACTTTGATAAAACCTCCAAT 1000 52.29 −0.50 −1.185 −0.547 −0.943 −0.826

615.4 857 CTTTGATAAAACCTCCAATT 1001 52.11 −0.50 −1.212 −0.547 −0.959

563.4 858 TTTGATAAAACCTCCAATTC 1002 51.46 −0.30 −1.307 −0.373 −0.952

420.9 859 TTGATAAAACCTCCAATTCC 1003 54.68 0.60 −0.834 0.410 −0.362

536.6 860 TGATAAAACCTCCAATTCCC 1004 57.79 0.60 −0.378 0.410 −0.079

1417.8 861 GATAAAACCTCCAATTCCCC 1005 61.15 1.00 0.114 0.757 0.359

4351.2 862 ATAAAACCTCCAATTCCCCC 1006 63.24 1.90 0.421 1.540 0.846

7738.7 863 TAAAACCTCCAATTCCCCCT 1007 64.88 1.90 0.663 1.540 0.996

11136.0 864 AAAACCTCCAATTCCCCCTA 1008 64.88 1.90 0.663 1.540 0.996 1.074

14811.0 865 AAACCTCCAATTCCCCCTAT 1009 66.73 1.90 0.933 1.540 1.164 1.261

15751.0 866 AACCTCCAATTCCCCCTATC 1010 70.07 1.80 1.424 1.453 1.435 1.330

19661.0 867 ACCTCCAATTCCCCCTATCA 1011 73.21 1.80 1.883 1.453 1.720 1.335

20301.0 868 CCTCCAATTCCCCCTATCAT 1012 72.64 1.80 1.801 1.453 1.669 1.327

19376.0 869 CTCCAATTCCCCCTATCATT 1013 69.66 1.60 1.364 1.279 1.332 1.254

17642.0 870 TCCAATTCCCCCTATCATTT 1014 68.21 1.10 1.150 0.844 1.034 1.093

13751.0 871 CCAATTCCCCCTATCATTTT 1015 67.12 1.10 0.991 0.844 0.935 0.931

12669.0 872 CAATTCCCCCTATCATTTTT 1016 64.02 1.10 0.536 0.844 0.653

9255.9 873 AATTCCCCCTATCATTTTTG 1017 62.80 0.40 0.357 0.236 0.311

8929.1 874 ATTCCCCCTATCATTTTTGG 1018 67.28 0.00 1.014 −0.112 0.586

6148.2 875 TTCCCCCTATCATTTTTGGT 1019 70.46 0.00 1.480 −0.112 0.875

5468.0 876 TCCCCCTATCATTTTTGGTT 1020 70.46 0.00 1.480 −0.112 0.875

5803.7 877 CCCCCTATCATTTTTGGTTT 1021 69.27 0.00 1.307 −0.112 0.768

5192.0 878 CCCCTATCATTTTTGGTTTC 1022 67.18 0.00 1.000 −0.112 0.577

3557.4 879 CCCTATCATTTTTGGTTTCC 1023 67.18 0.00 1.000 −0.112 0.577

5274.3 880 CCTATCATTTTTGGTTTCCA 1024 64.63 0.00 0.625 −0.112 0.345

3787.9 881 CTATCATTTTTGGTTTCCAT 1025 60.77 −0.50 0.059 −0.547 −0.171

2726.8 882 TATCATTTTTGGTTTCCATC 1026 60.20 −0.50 −0.025 −0.547 −0.223

3249.9 883 ATCATTTTTGGTTTCCATCT 1027 62.83 −0.50 0.361 −0.547 0.016

5548.9 884 TCATTTTTGGTTTCCATCTT 1028 63.21 −0.50 0.416 −0.547 0.050

5290.0 885 CATTTTTGGTTTCCATCTTC 1029 63.21 −0.50 0.416 −0.547 0.050

7451.0 886 ATTTTTGGTTTCCATCTTCC 1030 65.88 −0.50 0.809 −0.547 0.293

11578.0 887 TTTTTGGTTTCCATCTTCCT 1031 67.93 −0.50 1.109 −0.547 0.480

13722.0 888 TTTTGGTTTCCATCTTCCTG 1032 67.42 −0.50 1.035 −0.547 0.434

15064.0 889 TTTGGTTTCCATCTTCCTGG 1033 69.71 −0.90 1.370 −0.895 0.509

10869.0 890 TTGGTTTCCATCTTCCTGGC 1034 73.74 −1.30 1.962 −1.243 0.744

16035.0 891 TGGTTTCCATCTTCCTGGCA 1035 74.48 −1.30 2.071 −1.243 0.812

16304.0 892 GGTTTCCATCTTCCTGGCAA 1036 72.21 −1.30 1.737 −1.243 0.605

14885.0 893 GTTTCCATCTTCCTGGCAAA 1037 67.37 −1.30 1.027 −1.243 0.165

11910.0 894 TTTCCATCTTCCTGGCAAAC 1038 64.82 −1.30 0.653 −1.243 −0.067

11929.0 895 TTCCATCTTCCTGGCAAACT 1039 66.34 −1.30 0.877 −1.243 0.071

11517.0 896 TCCATCTTCCTGGCAAACTC 1040 67.47 −1.30 1.042 −1.243 0.174

11822.0 897 CCATCTTCCTGGCAAACTCA 1041 67.12 −0.90 0.991 −0.895 0.274

11710.0 898 CATCTTCCTGGCAAACTCAT 1042 63.55 0.90 0.466 0.670 0.544

7635.3 899 ATCTTCCTGGCAAACTCATT 1043 62.71 1.00 0.343 0.757 0.501

8378.2 900 TCTTCCTGGCAAACTCATTT 1044 63.06 0.90 0.395 0.670 0.500

6321.4 901 CTTCCTGGCAAACTCATTTC 1045 63.06 0.70 0.395 0.496 0.434

7659.0 902 TTCCTGGCAAACTCATTTCT 1046 63.06 0.70 0.395 0.496 0.434

11621.0 903 TCCTGGCAAACTCATTTCTT 1047 63.06 0.70 0.395 0.496 0.434

3389.0 904 CCTGGCAAACTCATTTCTTC 1048 63.06 0.70 0.395 0.496 0.434

3870.6 905 CTGGCAAACTCATTTCTTCT 1049 61.24 0.70 0.127 0.496 0.268

1992.7 906 TGGCAAACTCATTTCTTCTA 1050 58.74 0.70 −0.239 0.496 0.040

698.3 907 GGCAAACTCATTTCTTCTAA 1051 56.86 0.70 −0.514 0.496 −0.130

718.3 908 GCAAACTCATTTCTTCTAAT 1052 54.36 0.70 −0.882 0.496 −0.358

372.3 909 CAAACTCATTTCTTCTAATA 1053 49.93 0.60 −1.530 0.410 −0.793

180.6 910 AAACTCATTTCTTCTAATAC 1054 49.11 0.60 −1.651 0.410 −0.868

430.0 911 AACTCATTTCTTCTAATACT 1055 52.79 0.60 −1.111 0.410 −0.533

904.3 912 ACTCATTTCTTCTAATACTG 1056 54.63 0.60 −0.842 0.410 −0.366

1663.5 913 CTCATTTCTTCTAATACTGT 1057 57.14 0.60 −0.474 0.410 −0.138

2694.2 914 TCATTTCTTCTAATACTGTA 1058 54.51 0.60 −0.859 0.410 −0.377

3222.9 915 CATTTCTTCTAATACTGTAT 1059 53.21 0.60 −1.049 0.410 −0.495

3142.8 916 ATTTCTTCTAATACTGTATC 1060 53.13 0.80 −1.061 0.583 −0.436

5867.0 917 TTTCTTCTAATACTGTATCA 1061 54.51 1.20 −0.859 0.931 −0.179

6641.4 918 TTCTTCTAATACTGTATCAT 1062 54.17 1.30 −0.908 1.018 −0.176

7151.9 919 TCTTCTAATACTGTATCATC 1063 55.17 1.30 −0.762 1.018 −0.086

8134.9 920 CTTCTAATACTGTATCATCT 1064 55.86 1.30 −0.661 1.018 −0.023

8551.4 921 TTCTAATACTGTATCATCTG 1065 53.80 1.30 −0.964 1.018 −0.211

5741.7 922 TCTAATACTGTATCATCTGC 1066 57.65 1.30 −0.398 1.018 0.140

8575.9 923 CTAATACTGTATCATCTGCT 1067 58.28 1.30 −0.307 1.018 0.197

8980.3 924 TAATACTGTATCATCTGCTC 1068 57.65 1.30 −0.398 1.018 0.140

10762.0 925 AATACTGTATCATCTGCTCC 1069 62.19 1.30 0.268 1.018 0.553

17037.0 926 ATACTGTATCATCTGCTCCT 1070 66.43 1.30 0.889 1.018 0.938

20970.0 927 TACTGTATCATCTGCTCCTG 1071 66.32 1.30 0.874 1.018 0.929

23084.0 928 ACTGTATCATCTGCTCCTGT 1072 70.36 0.60 1.466 0.410 1.065 0.875

24474.0 929 CTGTATCATCTGCTCCTGTA 1073 69.13 0.60 1.286 0.410 0.953 0.910

22217.0 930 TGTATCATCTGCTCCTGTAT 1074 67.04 0.60 0.979 0.410 0.763 0.890

19829.0 931 GTATCATCTGCTCCTGTATC 1075 68.85 0.60 1.244 0.410 0.927 0.842

23548.0 932 TATCATCTGCTCCTGTATCT 1076 67.44 0.60 1.037 0.410 0.799

21759.0 933 ATCATCTGCTCCTGTATCTA 1077 67.44 0.60 1.037 0.410 0.799

22711.0 934 TCATCTGCTCCTGTATCTAA 1078 65.13 0.60 0.699 0.410 0.589

18134.0 935 CATCTGCTCCTGTATCTAAT 1079 63.60 1.00 0.475 0.757 0.582

17772.0 936 ATCTGCTCCTGTATCTAATA 1080 61.77 1.60 0.207 1.279 0.614

17134.0 937 TCTGCTCCTGTATCTAATAG 1081 62.01 1.60 0.241 1.279 0.635

10969.0 938 CTGCTCCTGTATCTAATAGA 1082 61.90 0.50 0.225 0.323 0.262

9556.3 939 TGCTCCTGTATCTAATAGAG 1083 60.12 0.30 −0.036 0.149 0.034

3739.9 940 GCTCCTGTATCTAATAGAGC 1084 64.50 −1.00 0.607 −0.982 0.003

4088.3 941 CTCCTGTATCTAATAGAGCT 1085 62.21 0.30 0.271 0.149 0.224

2263.0 942 TCCTGTATCTAATAGAGCTT 1086 60.56 0.30 0.028 0.149 0.074

1018.0 943 CCTGTATCTAATAGAGCTTC 1087 60.56 0.30 0.028 0.149 0.074

1319.1 944 CTGTATCTAATAGAGCTTCC 1088 60.56 0.30 0.028 0.149 0.074

2347.8 945 TGTATCTAATAGAGCTTCCT 1089 60.56 0.30 0.028 0.149 0.074

1871.6 946 GTATCTAATAGAGCTTCCTT 1090 61.00 0.30 0.092 0.149 0.114

3469.1 947 TATCTAATAGAGCTTCCTTT 1091 58.20 0.30 −0.318 0.149 −0.141

1114.6 948 ATCTAATAGAGCTTCCTTTA 1092 58.20 0.30 −0.318 0.149 −0.141

1358.4 949 TCTAATAGAGCTTCCTTTAG 1093 58.39 0.30 −0.289 0.149 −0.123

665.4 950 CTAATAGAGCTTCCTTTAGT 1094 60.12 0.00 −0.036 −0.112 −0.065

807.4 951 TAATAGAGCTTCCTTTAGTT 1095 58.46 0.30 −0.280 0.149 −0.117

608.7 952 AATAGAGCTTCCTTTAGTTG 1096 58.97 0.30 −0.205 0.149 −0.070

623.8 953 ATAGAGCTTCCTTTAGTTGC 1097 65.53 0.30 0.758 0.149 0.526

674.5 954 TAGAGCTTCCTTTAGTTGCC 1098 69.50 0.30 1.340 0.149 0.887 0.841

814.3 955 AGAGCTTCCTTTAGTTGCCC 1099 73.89 0.30 1.983 0.149 1.286 1.157

1183.8 956 GAGCTTCCTTTAGTTGCCCC 1100 77.20 0.30 2.470 0.149 1.588 1.454

2219.4 957 AGCTTCCTTTAGTTGCCCCC 1101 79.38 0.30 2.789 0.149 1.785 1.650

4642.2 958 GCTTCCTTTAGTTGCCCCCC 1102 82.41 0.40 3.234 0.236 2.095 1.765

8804.8 959 CTTCCTTTAGTTGCCCCCCT 1103 80.06 0.80 2.889 0.583 2.013 1.823

11331.0 960 TTCCTTTAGTTGCCCCCCTA 1104 77.67 1.10 2.539 0.844 1.895 1.818

12976.0 961 TCCTTTAGTTGCCCCCCTAT 1105 77.27 0.60 2.480 0.410 1.693 1.765

12369.0 962 CCTTTAGTTGCCCCCCTATC 1106 77.27 0.60 2.480 0.410 1.693 1.669

15090.0 963 CTTTAGTTGCCCCCCTATCT 1107 75.74 0.60 2.255 0.410 1.554 1.581

16130.0 964 TTTAGTTGCCCCCCTATCTT 1108 74.23 0.60 2.033 0.410 1.416 1.545

15304.0 965 TTAGTTGCCCCCCTATCTTT 1109 74.23 0.60 2.033 0.410 1.416 1.539

14829.0 966 TAGTTGCCCCCCTATCTTTA 1110 73.31 0.80 1.899 0.583 1.399 1.490

15309.0 967 AGTTGCCCCCCTATCTTTAT 1111 73.83 1.40 1.976 1.105 1.645 1.498

15205.0 968 GTTGCCCCCCTATCTTTATT 1112 73.91 1.40 1.986 1.105 1.652 1.524

14192.0 969 TTGCCCCCCTATCTTTATTG 1113 70.59 1.40 1.500 1.105 1.350 1.515

8699.5 970 TGCCCCCCTATCTTTATTGT 1114 73.39 1.40 1.911 1.105 1.605 1.461

7786.6 971 GCCCCCCTATCTTTATTGTG 1115 73.39 1.40 1.911 1.105 1.605 1.328

6709.1 972 CCCCCCTATCTTTATTGTGA 1116 70.61 1.40 1.502 1.105 1.351 1.165

6198.4 973 CCCCCTATCTTTATTGTGAC 1117 67.66 1.20 1.070 0.931 1.017 0.999

4910.2 974 CCCCTATCTTTATTGTGACG 1118 64.37 1.20 0.587 0.931 0.718

850.0 975 CCCTATCTTTATTGTGACGA 1119 62.05 1.20 0.248 0.931 0.507

404.9 976 CCTATCTTTATTGTGACGAG 1120 58.56 1.20 −0.265 0.931 0.190

166.6 977 CTATCTTTATTGTGACGAGG 1121 57.28 1.20 −0.452 0.931 0.073

126.9 978 TATCTTTATTGTGACGAGGG 1122 57.91 1.20 −0.361 0.931 0.130

92.6 979 ATCTTTATTGTGACGAGGGG 1123 61.03 1.20 0.097 0.931 0.414

97.9 980 TCTTTATTGTGACGAGGGGT 1124 64.18 0.90 0.559 0.670 0.601

122.3 981 CTTTATTGTGACGAGGGGTC 1125 64.18 −0.80 0.559 −0.808 0.039

267.0 982 TTTATTGTGACGAGGGGTCG 1126 62.63 −1.20 0.332 −1.156 −0.233

396.0 983 TTATTGTGACGAGGGGTCGT 1127 65.37 −2.30 0.734 −2.112 −0.348

446.0 984 TATTGTGACGAGGGGTCGTT 1128 65.37 −2.80 0.734 −2.547 −0.513

661.9 985 ATTGTGACGAGGGGTCGTTG 1129 65.82 −2.80 0.800 −2.547 −0.472

864.5 986 TTGTGACGAGGGGTCGTTGC 1130 70.01 −2.80 1.414 −2.547 −0.091

1465.7 987 TGTGACGAGGGGTCGTTGCC 1131 73.21 −2.80 1.884 −2.547 0.200

2836.9 988 GTGACGAGGGGTCGTTGCCA 1132 74.44 −2.80 2.065 −2.547 0.312

3589.7 989 TGACGAGGGGTCGTTGCCAA 1133 69.05 −2.80 1.274 −2.547 −0.178

2100.4 990 GACGAGGGGTCGTTGCCAAA 1134 67.10 −2.80 0.988 −2.547 −0.355

1948.7 991 ACGAGGGGTCGTTGCCAAAG 1135 66.13 −2.60 0.845 −2.373 −0.378

1384.3 992 CGAGGGGTCGTTGCCAAAGA 1136 66.81 −1.40 0.945 −1.330 0.081

1192.0 993 GAGGGGTCGTTGCCAAAGAG 1137 66.84 0.20 0.950 0.062 0.612

1221.0 994 AGGGGTCGTTGCCAAAGAGT 1138 68.70 0.20 1.223 0.062 0.782

953.2 995 GGGGTCGTTGCCAAAGAGTG 1139 68.32 0.20 1.167 0.062 0.747

988.6 996 GGGTCGTTGCCAAAGAGTGA 1140 67.11 0.20 0.989 0.062 0.636

937.8 997 GGTCGTTGCCAAAGAGTGAT 1141 64.59 0.50 0.620 0.323 0.507

852.1 998 GTCGTTGCCAAAGAGTGATC 1142 63.51 0.00 0.461 −0.112 0.243

1189.4 999 TCGTTGCCAAAGAGTGATCT 1143 62.35 −1.00 0.291 −0.982 −0.192

1501.7 1000 CGTTGCCAAAGAGTGATCTG 1144 60.92 −1.20 0.081 −1.156 −0.389

1360.9 1001 GTTGCCAAAGAGTGATCTGA 1145 61.71 −1.20 0.198 −1.156 −0.317

1112.9 1002 TTGCCAAAGAGTGATCTGAG 1146 58.90 −1.20 −0.215 −1.156 −0.572

468.3 1003 TGCCAAAGAGTGATCTGAGG 1147 61.08 −1.20 0.104 −1.156 −0.375

400.1 1004 GCCAAAGAGTGATCTGAGGG 1148 63.68 −1.50 0.485 −1.417 −0.237

401.6 1005 CCAAAGAGTGATCTGAGGGA 1149 60.94 −1.20 0.084 −1.156 −0.387

199.9 1006 CAAAGAGTGATCTGAGGGAA 1150 55.32 −1.20 −0.741 −1.156 −0.899

202.1 1007 AAAGAGTGATCTGAGGGAAG 1151 54.21 −1.20 −0.903 −1.156 −0.999

258.7 1008 AAGAGTGATCTGAGGGAAGT 1152 59.12 −1.20 −0.183 −1.156 −0.552

274.7 1009 AGAGTGATCTGAGGGAAGTT 1153 61.60 −1.00 0.181 −0.982 −0.261

297.2 1010 GAGTGATCTGAGGGAAGTTA 1154 60.78 −0.30 0.061 −0.373 −0.104

250.6 1011 AGTGATCTGAGGGAAGTTAA 1155 57.35 0.60 −0.443 0.410 −0.119

231.3 1012 GTGATCTGAGGGAAGTTAAA 1156 55.25 0.60 −0.751 0.410 −0.310

214.5 1013 TGATCTGAGGGAAGTTAAAG 1157 52.55 0.60 −1.147 0.410 −0.556

102.3 1014 GATCTGAGGGAAGTTAAAGG 1158 55.09 0.60 −0.774 0.410 −0.324

102.3 1015 ATCTGAGGGAAGTTAAAGGA 1159 55.09 0.60 −0.774 0.410 −0.324

49.4 1016 TCTGAGGGAAGTTAAAGGAT 1160 55.09 0.60 −0.774 0.410 −0.324

104.3 1017 CTGAGGGAAGTTAAAGGATA 1161 53.32 1.00 −1.034 0.757 −0.353

46.3 1018 TGAGGGAAGTTAAAGGATAC 1162 51.95 1.30 −1.235 1.018 −0.378

50.9 1019 GAGGGAAGTTAAAGGATACA 1163 53.26 0.90 −1.043 0.670 −0.392

58.2 1020 AGGGAAGTTAAAGGATACAG 1164 52.14 0.90 −1.207 0.670 −0.494

50.5 1021 GGGAAGTTAAAGGATACAGT 1165 54.81 0.90 −0.815 0.670 −0.251

53.1

Example 3

[0239] Synopsis: The method of the present invention is particularly useful as a guide to the iterative refinement of probes. One of the specific predictions made for rabbit β-globin in Example 1 is used to provide an example of such a refinement.

[0240] Materials and Methods: The contig spanning positions 5-11 of a portion of the rabbit β-globin gene (Example 1, Table 3) was analyzed, using the experimentally measured data to simulate the results of successive experimental measurements. The iterative refinement was performed using a rule-based algorithm, outlined below. This algorithm is used by way of example only; other algorithms for efficiently finding local maxima are well known in the art and could be employed to perform this task.

[0241] Given experimental data for probes from the 1st quartile, median and 3rd quartile of a contig, as well as a user-set signal threshold for further consideration of a probe:

[0242] 1) If all 3 measurements are below the user-specified signal threshold, discard the prediction.

[0243] 2) If at least one of the measurements is above the user-specified threshold, determine which point yields the maximum signal.

[0244] a) If the maximum point is the 1st quartile probe, then make three new measurements for probes with the same spacing as that used in the preceding iteration, but displaced so that the third probe is identical to the original 1st quartile probe. In other words, repeat the search with the same pattern and spacing, but displace the pattern in the direction of increasing signal found in the first experiment.

[0245] b) If the maximum point is the 3rd quartile probe, then make three new measurements for probes with the same spacing as that used in the preceding iteration, but displaced so that the first probe is identical to the original 3rd quartile probe. In other words, repeat the search with the same pattern and spacing, but displace the pattern in the direction of increasing signal found in the first experiment.

[0246] c) If the maximum point is the median probe, then repeat the experiment, keeping the median point the same, but shrinking the spacing between probes by a factor of 2.

[0247] 3) Continue iteration until a maximum is found, or the user judges the signal level observed to be acceptable. Use the experimental value measured for the probe duplicated in successive iterations to tie together the successive data sets, via a simple normalization procedure, described below. Where appropriate, consider all of the data (i.e. all of the iterations) when deciding how to proceed, or whether the peak hybridization intensity has been found.
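By way of illustration only, rules 1 and 2a-2c above can be sketched as a single decision step in Python. The function name `next_step`, the action labels, the tie-breaking choice (lowest position wins) and the treatment of a median maximum at unit spacing as a found peak are assumptions made for this sketch; they are not taken from the Appendix source code.

```python
from typing import Sequence, Tuple


def next_step(
    positions: Sequence[int],
    signals: Sequence[float],
    threshold: float,
) -> Tuple[str, Tuple[int, ...]]:
    """Apply rules 1 and 2a-2c to one iteration of three probe measurements.

    positions: start positions of the 1st quartile, median and 3rd quartile
               probes, evenly spaced and in increasing order.
    signals:   measured (normalized) hybridization intensities, same order.
    Returns an (action, positions) pair; both are hypothetical conventions.
    """
    if len(positions) != 3 or len(signals) != 3:
        raise ValueError("expected exactly three probes per iteration")
    first, median, third = positions
    spacing = median - first

    # Rule 1: all three measurements below the threshold -> discard.
    if all(s < threshold for s in signals):
        return ("discard", ())

    # Rule 2: find the maximum (ties resolved to the lowest position; assumption).
    best = max(range(3), key=lambda i: signals[i])

    if best == 0:
        # Rule 2a: shift the pattern so its third probe is the old 1st quartile probe.
        return ("shift_down", (first - 2 * spacing, first - spacing, first))
    if best == 2:
        # Rule 2b: shift the pattern so its first probe is the old 3rd quartile probe.
        return ("shift_up", (third, third + spacing, third + 2 * spacing))
    if spacing <= 1:
        # Median is the maximum and spacing cannot shrink further (assumption).
        return ("peak", (median,))
    # Rule 2c: keep the median fixed and halve the spacing.
    half = spacing // 2
    return ("shrink", (median - half, median, median + half))


# Threshold of 100 is a hypothetical value chosen for the illustration.
print(next_step((6, 8, 10), (180, 220, 310), 100.0))
print(next_step((10, 12, 14), (310, 390, 240), 100.0))
```

The step is kept free of any measurement logic so that the same rule set could be driven either by stored data, as in this Example, or by fresh experimental measurements.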

[0248] Results: Iterative refinement of the contig spanning positions 5-11 in Table 3 proceeds as follows:

[0249] Iteration 1: Probes are synthesized at positions 6, 8 and 10, yielding the experimental hybridization intensities 180, 220 and 310, respectively.

[0250] Iteration 2: Following rule 2b), probes are synthesized at positions 10, 12 and 14. Note that the redundant measurement at position 10 serves as a bridge between experiments. The two data sets are made comparable by multiplying the second iteration measurements by the ratio of the intensity observed for the probe at position 10 in the first iteration to the value observed for that probe in the second iteration. In the simplest case, the ratio is 1; in any case, the second iteration yields the normalized values 310, 390 and 240 for probe positions 10, 12 and 14, respectively.
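The bridge normalization described for iteration 2 can be sketched as follows. The function name `normalize_to_bridge` and the raw second iteration intensities (155, 195, 120) are hypothetical; the specification reports only the normalized values 310, 390 and 240, so the raw values were chosen here merely to give a ratio of 2.

```python
from typing import Dict


def normalize_to_bridge(
    raw: Dict[int, float],
    bridge_position: int,
    reference_intensity: float,
) -> Dict[int, float]:
    """Rescale one iteration so its bridge probe matches the earlier iteration.

    Every raw intensity is multiplied by the ratio
    (bridge intensity in the earlier iteration) / (bridge intensity in this one).
    """
    observed = raw[bridge_position]
    if observed <= 0:
        raise ValueError("bridge probe intensity must be positive")
    ratio = reference_intensity / observed
    return {position: value * ratio for position, value in raw.items()}


# Hypothetical raw readings for iteration 2; position 10 read 310 in iteration 1.
raw_second = {10: 155.0, 12: 195.0, 14: 120.0}
normalized = normalize_to_bridge(raw_second, bridge_position=10, reference_intensity=310.0)
print(normalized)  # {10: 310.0, 12: 390.0, 14: 240.0}
```

When the bridge probe reads the same in both iterations the ratio is 1 and the values pass through unchanged, which is the simplest case mentioned above.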

[0251] Iteration 3: By rule 2c), measurements are performed for probes at positions 11, 12 and 13; after normalization, these yield the normalized hybridization intensities 320, 390 and 410, respectively. Combination of these results with the results from iteration 2, probe position 14, yields the conclusion that the best probe for this intensity peak is the probe that starts at sequence position 13.

[0252] The overall result is that iterative improvement converges in three iterations, and requires the synthesis of seven test probes, one of which is the local optimal probe. In addition, the first and second iterations yield probes that exhibit 75% and 95% of the local maximum hybridization intensities, respectively. In many applications, either of these probes would be considered acceptable.
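The figures quoted in this Example can be checked with a short simulation that replays the reported normalized intensities through the rule set. The sketch below is a minimal illustration, not the Appendix software: the names `REPORTED` and `refine`, the threshold of 100, and the stopping test (the best probe measured so far has both immediate neighbors measured and lower) are assumptions. The bridge probe is looked up once rather than re-measured, because only normalized values are reported.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Normalized intensities reported in iterations 1-3 of this Example.
REPORTED: Dict[int, float] = {
    6: 180.0, 8: 220.0, 10: 310.0, 11: 320.0, 12: 390.0, 13: 410.0, 14: 240.0,
}


def refine(
    start: Tuple[int, int, int],
    threshold: float,
    measure: Callable[[int], float],
) -> Tuple[Optional[int], Dict[int, float], List[float]]:
    """Iterate rules 1 and 2a-2c until a local maximum is bracketed."""
    data: Dict[int, float] = {}
    best_per_iteration: List[float] = []
    positions = start
    while True:
        for p in positions:
            if p not in data:  # each distinct probe is synthesized once
                data[p] = measure(p)
        signals = [data[p] for p in positions]
        best_per_iteration.append(max(signals))
        if max(signals) < threshold:  # rule 1
            return None, data, best_per_iteration
        # Assumed stopping test, using all data gathered so far (rule 3).
        top = max(data, key=lambda p: data[p])
        lo, hi = data.get(top - 1), data.get(top + 1)
        if lo is not None and hi is not None and lo < data[top] and hi < data[top]:
            return top, data, best_per_iteration
        first, median, third = positions
        spacing = median - first
        i = signals.index(max(signals))
        if i == 0:  # rule 2a
            positions = (first - 2 * spacing, first - spacing, first)
        elif i == 2:  # rule 2b
            positions = (third, third + spacing, third + 2 * spacing)
        elif spacing > 1:  # rule 2c
            half = spacing // 2
            positions = (median - half, median, median + half)
        else:
            return median, data, best_per_iteration


peak, data, best_per_iteration = refine((6, 8, 10), 100.0, REPORTED.__getitem__)
fractions = [value / data[peak] for value in best_per_iteration]
print(peak, len(data), len(best_per_iteration))
print([f"{100 * f:.1f}%" for f in fractions])
```

Under these assumptions the simulation ends at position 13 after three iterations and seven distinct probes, and the best probes of the first two iterations reach 310/410 and 390/410 of the peak intensity, i.e. roughly the 75% and 95% stated above.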

[0253] Examples 1 and 2 above demonstrate that two different implementations of the method of the present invention are capable of efficiently predicting regions of high hybridization efficiency in a variety of polynucleotide targets. Many of the predictions yield acceptable probe sequences on the first design iteration, and all would yield optimized probe sets after 2-4 rounds of iterative refinement, as demonstrated in Example 3. The performance demonstrated in these examples greatly exceeds the performance of current methods. Finally, the examples demonstrate that the predictions can be performed by a software application that has been implemented and installed on a Pentium®-based computer workstation.

[0254] All publications and patent applications cited in this specification are herein incorporated by reference as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.

[0255] Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.

1 1165 24 base pairs nucleic acid single linear cDNA YES NO stem_loop2..21 1 ACTGGCAATC ACAATTGCCA GTAA 24 75 base pairs nucleic acid singlelinear tRNA NO NO Saccharomyces cerevisiae tRNA 1..75 experimental/function= “transfer RNA” /product= “tRNA-Ala” /evidence= EXPERIMENTAL/anticodon= (pos 34 .. 36, aa Ala) /citation= ([1][2]) modified_base 9experimental /evidence= EXPERIMENTAL /frequency= 0.9999 /mod_base= m1g/citation= ([1][2]) modified_base 16 experimental /evidence=EXPERIMENTAL /frequency= 0.9999 /mod_base= d /citation= ([1][2])modified_base 20 experimental /evidence= EXPERIMENTAL /frequency= 0.9999/mod_base= d /citation= ([1][2]) modified_base 26 experimental/evidence= EXPERIMENTAL /frequency= 0.9999 /mod_base= m22g /citation=([1][2]) modified_base 34 experimental /evidence= EXPERIMENTAL/frequency= 0.9999 /mod_base= i /citation= ([1][2]) modified_base 37experimental /evidence= EXPERIMENTAL /frequency= 0.9999 /mod_base= m1i/citation= ([1][2]) modified_base 38 experimental /evidence=EXPERIMENTAL /frequency= 0.9999 /mod_base= p /citation= ([1][2])modified_base 46 experimental /evidence= EXPERIMENTAL /frequency= 0.9999/mod_base= d /citation= ([1][2]) modified_base 53 experimental/evidence= EXPERIMENTAL /frequency= 0.9999 /mod_base= t /citation=([1][2]) modified_base 54 experimental /evidence= EXPERIMENTAL/frequency= 0.9999 /mod_base= p /citation= ([1][2]) R. W. Apgar, J.Everett, G. A. Madison, J. T. Marquisee, M. Merrill, S. H. Penswick, J.R. Zamir, A. Holley Structure of a ribonucleic acid Science 1471462-1465 1965 2 FROM 1 TO 75 J. R. Martin, R. Dirheimer, G. 
PenswickEvidence supporting a revised sequence for yeast alanine tRNA FEBS Lett.50 28-31 1975 2 FROM 1 TO 75 2 GGGCGUGUGG CGUAGUCGGU AGCGCGCUCCCUUGGCGUGG GAGAGUCUCC GGUUCGAUUC 60 CGGACUCGUC CACCA 75 16 base pairsnucleic acid single linear cDNA YES NO 3 ATGGACTTAG CATTCG 16 12 basepairs nucleic acid single linear cDNA YES NO 4 ATGGACTTAG CA 12 12 basepairs nucleic acid single linear cDNA YES NO 5 TGGACTTAGC AT 12 12 basepairs nucleic acid single linear cDNA YES NO 6 GGACTTAGCA TT 12 12 basepairs nucleic acid single linear cDNA YES NO 7 GACTTAGCAT TC 12 12 basepairs nucleic acid single linear cDNA YES NO 8 ACTTAGCATT CG 12 50 basepairs nucleic acid single linear cDNA YES NO 9 GTCCAAAAAG GGTCAGTCTACCTCCCGCCA TAAAAAACTC ATGTTCAAGA 50 25 base pairs nucleic acid singlelinear cDNA YES NO 10 GTCCAAAAAG GGTCAGTCTA CCTCC 25 25 base pairsnucleic acid single linear cDNA YES NO 11 TCCAAAAAGG GTCAGTCTAC CTCCC 2525 base pairs nucleic acid single linear cDNA YES NO 12 CCAAAAAGGGTCAGTCTACC TCCCG 25 25 base pairs nucleic acid single linear cDNA YES NO13 CAAAAAGGGT CAGTCTACCT CCCGC 25 25 base pairs nucleic acid singlelinear cDNA YES NO 14 AAAAAGGGTC AGTCTACCTC CCGCC 25 25 base pairsnucleic acid single linear cDNA YES NO 15 AAAAGGGTCA GTCTACCTCC CGCCA 2525 base pairs nucleic acid single linear cDNA YES NO 16 AAAGGGTCAGTCTACCTCCC GCCAT 25 25 base pairs nucleic acid single linear cDNA YES NO17 AAGGGTCAGT CTACCTCCCG CCATA 25 25 base pairs nucleic acid singlelinear cDNA YES NO 18 AGGGTCAGTC TACCTCCCGC CATAA 25 25 base pairsnucleic acid single linear cDNA YES NO 19 GGGTCAGTCT ACCTCCCGCC ATAAA 2525 base pairs nucleic acid single linear cDNA YES NO 20 GGTCAGTCTACCTCCCGCCA TAAAA 25 25 base pairs nucleic acid single linear cDNA YES NO21 GTCAGTCTAC CTCCCGCCAT AAAAA 25 25 base pairs nucleic acid singlelinear cDNA YES NO 22 TCAGTCTACC TCCCGCCATA AAAAA 25 25 base pairsnucleic acid single linear cDNA YES NO 23 CAGTCTACCT CCCGCCATAA AAAAC 2525 base pairs nucleic acid single linear cDNA YES NO 24 
AGTCTACCTCCCGCCATAAA AAACT 25 25 base pairs nucleic acid single linear cDNA YES NO25 GTCTACCTCC CGCCATAAAA AACTC 25 25 base pairs nucleic acid singlelinear cDNA YES NO 26 TCTACCTCCC GCCATAAAAA ACTCA 25 25 base pairsnucleic acid single linear cDNA YES NO 27 CTACCTCCCG CCATAAAAAA CTCAT 2525 base pairs nucleic acid single linear cDNA YES NO 28 TACCTCCCGCCATAAAAAAC TCATG 25 25 base pairs nucleic acid single linear cDNA YES NO29 ACCTCCCGCC ATAAAAAACT CATGT 25 25 base pairs nucleic acid singlelinear cDNA YES NO 30 CCTCCCGCCA TAAAAAACTC ATGTT 25 25 base pairsnucleic acid single linear cDNA YES NO 31 CTCCCGCCAT AAAAAACTCA TGTTC 2525 base pairs nucleic acid single linear cDNA YES NO 32 TCCCGCCATAAAAAACTCAT GTTCA 25 25 base pairs nucleic acid single linear cDNA YES NO33 CCCGCCATAA AAAACTCATG TTCAA 25 25 base pairs nucleic acid singlelinear cDNA YES NO 34 CCGCCATAAA AAACTCATGT TCAAG 25 25 base pairsnucleic acid single linear cDNA YES NO 35 CGCCATAAAA AACTCATGTT CAAGA 25122 base pairs nucleic acid single linear cDNA NO NO Oryctolaguscuniculus 5′UTR 1..53 CDS 54..122 /codon_start= 54 /product= “rabbitbeta1 globin, N-terminus” /citation= ([1]) M. L. III Johnson, J. E.James, M. D. Hardison, R. C. Rohrbaugh Transcriptional unit of therabbit beta1 globin gene Mol. Cell. Biol. 5 147-160 1985 36 FROM 1 TO122 36 ACACTTGCTT TTGACACAAC TGTGTTTACT TGCAATCCCC CAAAACAGAC AGA ATG 56Met 1 GTG CAT CTG TCC AGT GAG GAG AAG TCT GCG GTC ACT GCC CTG TGG GGC104 Val His Leu Ser Ser Glu Glu Lys Ser Ala Val Thr Ala Leu Trp Gly 5 1015 AAG GTG AAT GTG GAA GAA 122 Lys Val Asn Val Glu Glu 20 1040 basepairs nucleic acid single linear cDNA NO NO Human immunodefficiencyvirus type I BH10 misc_RNA 1..1040 experimental /partial /function=“protease & reverse transcriptase regions” /product= “pol polyprotein(partial)” /evidence= EXPERIMENTAL /citation= ([1]) F. Gallo, R. C.Chang, N. T. Ghrayeb, J. Papas, T. S. Lautenberger, J. A. Pearson, M. L.Jr. Petteway, S. R. Ivanoff, L. Baumeister, K. 
Wong-Stahl Completenucleotide sequence of the AIDS virus, HTLV-III Nature 313 277-284 198537 FROM 1 TO 1040 37 TGTACTGTCC ATTTATCAGG ATGGAGTTCA TAACCCATCCAAAGGAATGG AGGTTCTTTC 60 TGATGTTTTT TGTCTGGTGT GGTAAGTCCC CACCTCAACAGATGTTGTCT CAGCTCCTCT 120 ATTTTTGTTC TATGCTGCCC TATTTCTAAG TCAGATCCTACATACAAATC ATCCATGTAT 180 TGATAGATAA CTATGTCTGG ATTTTGTTTT TTAAAAGGCTCTAAGATTTT TGTCATGCTA 240 CTTTGGAATA TTGCTGGTGA TCCTTTCCAT CCCTGTGGAAGCACATTGTA CTGATATCTA 300 ATCCCTGGTG TCTCATTGTT TATACTAGGT ATGGTAAATGCAGTATACTT CCTGAAGTCT 360 TCATCTAAGG GAACTGAAAA ATATGCATCA CCCACATCCAGTACTGTTAC TGATTTTTTC 420 TTTTTTAACC CTGCGGGATG TGGTATTCCT AATTGAACTTCCCAGAAGTC TTGAGTTCTC 480 TTATTAAGTT CTCTGAAATC TACTAATTTT CTCCATTTAGTACTGTCTTT TTTCTTTATG 540 GCAAATACTG GAGTATTGTA TGGATTCTCA GGCCCAATTTTTGAAATTTT CCCTTCCTTT 600 TCCATTTCTG TACAAATTTC TACTAATGCT TTTATTTTTTCTTCTGTCAA TGGCCATTGT 660 TTAACTTTTG GGCCATCCAT TCCTGGCTTT AATTTTACTGGTACAGTCTC AATAGGGCTA 720 ATGGGAAAAT TTAAAGTGCA ACCAATCTGA GTCAACAGATTTCTTCCAAT TATGTTGACA 780 GGTGTAGGTC CTACTAATAC TGTACCTATA GCTTTATGTCCACAGATTTC TATGAGTATC 840 TGATCATACT GTCTTACTTT GATAAAACCT CCAATTCCCCCTATCATTTT TGGTTTCCAT 900 CTTCCTGGCA AACTCATTTC TTCTAATACT GTATCATCTGCTCCTGTATC TAATAGAGCT 960 TCCTTTAGTT GCCCCCCTAT CTTTATTGTG ACGAGGGGTCGTTGCCAAAG AGTGATCTGA 1020 GGGAAGTTAA AGGATACAGT 1040 999 base pairsnucleic acid single linear cDNA NO NO Homo sapiens CDS 1..982experimental /partial /codon_start= 2 /function= “glycolysis” /product=“Glyceraldehydephosphate Dehydrogenase” /evidence= EXPERIMENTAL/standard_name= “G3PDH” /citation= ([1]) promoter 983..999 /function=“promoter for T7 RNA polymerase” P. Martinelli, R. Salvatore, F. ArcariThe complete sequence of a full length cDNA for human liverglyceraldehyde-3-phosphate dehydrogenase evidence for multiple mRNAspecies Nucleic Acids Res. 
12 23 9179-9189 1984 38 FROM 1 TO 999 38 GAAG GTC GGA GTC AAC GGA TTT GGT CGT ATT GGG CGC CTG GTC ACC 46 Lys ValGly Val Asn Gly Phe Gly Arg Ile Gly Arg Leu Val Thr 1 5 10 15 AGG GCTGCT TTT AAC TCT GGT AAA GTG GAT ATT GTT GCC ATC AAT GAC 94 Arg Ala AlaPhe Asn Ser Gly Lys Val Asp Ile Val Ala Ile Asn Asp 20 25 30 CCC TTC ATTGAC CTC AAC TAC ATG GTT TAC ATG TTC CAA TAT GAT TCC 142 Pro Phe Ile AspLeu Asn Tyr Met Val Tyr Met Phe Gln Tyr Asp Ser 35 40 45 ACC CAT GGC AAATTC CAT GGC ACC GTC AAG GCT GAG AAC GGG AAG CTT 190 Thr His Gly Lys PheHis Gly Thr Val Lys Ala Glu Asn Gly Lys Leu 50 55 60 GTC ATC AAT GGA AATCCC ATC ACC ATC TTC CAG GAG CGA GAT CCC TCC 238 Val Ile Asn Gly Asn ProIle Thr Ile Phe Gln Glu Arg Asp Pro Ser 65 70 75 AAA ATC AAG TGG GGC GATGCT GGC GCT GAG TAC GTC GTG GAG TCC ACT 286 Lys Ile Lys Trp Gly Asp AlaGly Ala Glu Tyr Val Val Glu Ser Thr 80 85 90 95 GGC GTC TTC ACC ACC ATGGAG AAG GCT GGG GCT CAT TTG CAG GGG GGA 334 Gly Val Phe Thr Thr Met GluLys Ala Gly Ala His Leu Gln Gly Gly 100 105 110 GCC AAA AGG GTC ATC ATCTCT GCC CCC TCT GCT GAT GCC CCC ATG TTC 382 Ala Lys Arg Val Ile Ile SerAla Pro Ser Ala Asp Ala Pro Met Phe 115 120 125 GTC ATG GGT GTG AAC CATGAG AAG TAT GAC AAC AGC CTC AAG ATC ATC 430 Val Met Gly Val Asn His GluLys Tyr Asp Asn Ser Leu Lys Ile Ile 130 135 140 AGC AAT GCC TCC TGC ACCACC AAC TGC TTA GCA CCC CTG GCC AAG GTC 478 Ser Asn Ala Ser Cys Thr ThrAsn Cys Leu Ala Pro Leu Ala Lys Val 145 150 155 ATC CAT GAC AAC TTT GGTATC GTG GAA GGA CTC ATG ACC ACA GTC CAT 526 Ile His Asp Asn Phe Gly IleVal Glu Gly Leu Met Thr Thr Val His 160 165 170 175 GCC ATC ACT GCC ACCCAG AAG ACT GTG GAT GGC CCC TCC GGG AAA CTG 574 Ala Ile Thr Ala Thr GlnLys Thr Val Asp Gly Pro Ser Gly Lys Leu 180 185 190 TGG CGT GAT GGC CGCGGG GCT CTC CAG AAC ATC ATC CCT GCC TCT ACT 622 Trp Arg Asp Gly Arg GlyAla Leu Gln Asn Ile Ile Pro Ala Ser Thr 195 200 205 GGC GCT GCC AAG GCTGTG GGC AAG GTC ATC CCT GAG CTA GAC GGG AAG 670 Gly Ala Ala Lys Ala ValGly Lys Val Ile Pro Glu Leu Asp Gly Lys 
210 215 220
CTC ACT GGC ATG GCC TTC CGT GTC CCC ACT GCC AAC GTG TCA GTG GTG 718
Leu Thr Gly Met Ala Phe Arg Val Pro Thr Ala Asn Val Ser Val Val
225 230 235
GAC CTG ACC TGC CGT CTA GAA AAA CCT GCC AAA TAT GAT GAC ATC AAG 766
Asp Leu Thr Cys Arg Leu Glu Lys Pro Ala Lys Tyr Asp Asp Ile Lys
240 245 250 255
AAG GTG GTG AAG CAG GCG TCG GAG GGC CCC CTC AAA GGC ATC CTG GGC 814
Lys Val Val Lys Gln Ala Ser Glu Gly Pro Leu Lys Gly Ile Leu Gly
260 265 270
TAC ACT GAG CAC CAG GTG GTC TCC TCT GAC TTC AAC AGC GAC ACC CAC 862
Tyr Thr Glu His Gln Val Val Ser Ser Asp Phe Asn Ser Asp Thr His
275 280 285
TCC TCC ACC TTT GAC GCT GGG GCT GGC ATT GCC CTC AAC GAC CAC TTT 910
Ser Ser Thr Phe Asp Ala Gly Ala Gly Ile Ala Leu Asn Asp His Phe
290 295 300
GTC AAG CTC ATT TCC TGG TAT GAC AAC GAA TTT GGC TAC AGC AAC AGG 958
Val Lys Leu Ile Ser Trp Tyr Asp Asn Glu Phe Gly Tyr Ser Asn Arg
305 310 315
GTG GTG GAC CTC ATG GCC CAC ATG CTATAGTGAG TCGTATT 999
Val Val Asp Leu Met Ala His Met
320 325
1049 base pairs nucleic acid single linear cDNA NO NO Homo sapiens
CDS 1..372 experimental /partial /codon_start= 1 /function=“tumor suppressor” /product= “p53 (C-terminal portion)” /evidence=EXPERIMENTAL /gene= “HSP53G” /standard_name= “p53”
3′UTR 373..1049 /citation= ([1])
P. A. Barrett, J. C. Wiseman, R. W. Futreal An Alu polymorphism intragenic to the TP53 gene Nucleic Acids Res.
19 24 6977-1991 39 FROM 1 TO 1049 39
GAG GTG CGT GTT TGT GCC TGT CCT GGG AGA GAC CGG CGC ACA GAG GAA 48
Glu Val Arg Val Cys Ala Cys Pro Gly Arg Asp Arg Arg Thr Glu Glu
1 5 10 15
GAG AAT CTC CGC AAG AAA GGG GAG CCT CAC CAC GAG CTG CCC CCA GGG 96
Glu Asn Leu Arg Lys Lys Gly Glu Pro His His Glu Leu Pro Pro Gly
20 25 30
AGC ACT AAG CGA GCA CTG CCC AAC AAC ACC AGC TCC TCT CCC CAG CCA 144
Ser Thr Lys Arg Ala Leu Pro Asn Asn Thr Ser Ser Ser Pro Gln Pro
35 40 45
AAG AAG AAA CCA CTG GAT GGA GAA TAT TTC ACC CTT CAG ATC CGT GGG 192
Lys Lys Lys Pro Leu Asp Gly Glu Tyr Phe Thr Leu Gln Ile Arg Gly
50 55 60
CGT GAG CGC TTC GAG ATG TTC CGA GAG CTG AAT GAG GCC TTG GAA CTC 240
Arg Glu Arg Phe Glu Met Phe Arg Glu Leu Asn Glu Ala Leu Glu Leu
65 70 75 80
AAG GAT GCC CAG GCT GGG AAG GAG CCA GGG GGG AGC AGG GCT CAC TCC 288
Lys Asp Ala Gln Ala Gly Lys Glu Pro Gly Gly Ser Arg Ala His Ser
85 90 95
AGC CAC CTG AAG TCC AAA AAG GGT CAG TCT ACC TCC CGC CAT AAA AAA 336
Ser His Leu Lys Ser Lys Lys Gly Gln Ser Thr Ser Arg His Lys Lys
100 105 110
CTC ATG TTC AAG ACA GAA GGG CCT GAC TCA GAC TGA CATTCTCCAC 382
Leu Met Phe Lys Thr Glu Gly Pro Asp Ser Asp *
115 120
TTCTTGTTCC CCACTGACAG CCTCCCTCCC CCATCTCTCC CTCCCCTGCC ATTTTGGGTT 442
TTGGGTCTTT GAACCCTTGC TTGCAATAGG TGTGCGTCAG AAGCACCCAG GACTTCCATT 502
TGCTTTGTCC CGGGGCTCCA CTGAACAAGT TGGCCTGCAC TGGTGTTTTG TTGTGGGGAG 562
GAGGATGGGG AGTAGGACAT ACCAGCTTAG ATTTTAAGGT TTTTACTGTG AGGGATGTTT 622
GGGAGATGTA AGAAATGTTC TTGCAGTTAA GGGTTAGTTT ACAATCAGCC ACATTCTAGG 682
TAGGTAGGGG CCCACTTCAC CGTACTAACC AGGGAAGCTG TCCCTCATGT TGAATTTTCT 742
CTAACTTCAA GGCCCATATC TGTGAAATGC TGGCATTTGC ACCTACCTCA CAGAGTGCAT 802
TGTGAGGGTT AATGAAATAA TGTACATCTG GCCTTGAAAC CACCTTTTAT TACATGGGGT 862
CTAAAACTTG ACCCCCTTGA GGGTGCCTGT TCCCTCTCCC TCTCCCTGTT GGCTGGTGGG 922
TTGGTAGTTT CTACAGTTGG GCAGCTGGTT AGGTAGAGGG AGTTGTCAAG TCTTGCTGGC 982
CCAGCCAAAC CCTGTCTGAC AACCTCTTGG TCGACCTTAG TACCTAAAAG GAAATCTCAC 1042
CCCATCC 1049
17 base pairs nucleic acid single linear cDNA NO NO 40 TTCTTCCACA TTCACCT 17
17 base pairs
nucleic acid single linear cDNA NO NO 41 TCTTCCACAT TCACCTT 17
17 base pairs nucleic acid single linear cDNA NO NO 42 CTTCCACATT CACCTTG 17
17 base pairs nucleic acid single linear cDNA NO NO 43 TTCCACATTC ACCTTGC 17
17 base pairs nucleic acid single linear cDNA NO NO 44 TCCACATTCA CCTTGCC 17
17 base pairs nucleic acid single linear cDNA NO NO 45 CCACATTCAC CTTGCCC 17
17 base pairs nucleic acid single linear cDNA NO NO 46 CACATTCACC TTGCCCC 17
17 base pairs nucleic acid single linear cDNA NO NO 47 ACATTCACCT TGCCCCA 17
17 base pairs nucleic acid single linear cDNA NO NO 48 CATTCACCTT GCCCCAC 17
17 base pairs nucleic acid single linear cDNA NO NO 49 ATTCACCTTG CCCCACA 17
17 base pairs nucleic acid single linear cDNA NO NO 50 TTCACCTTGC CCCACAG 17
17 base pairs nucleic acid single linear cDNA NO NO 51 TCACCTTGCC CCACAGG 17
17 base pairs nucleic acid single linear cDNA NO NO 52 CACCTTGCCC CACAGGG 17
17 base pairs nucleic acid single linear cDNA NO NO 53 ACCTTGCCCC ACAGGGC 17
17 base pairs nucleic acid single linear cDNA NO NO 54 CCTTGCCCCA CAGGGCA 17
17 base pairs nucleic acid single linear cDNA NO NO 55 CTTGCCCCAC AGGGCAG 17
17 base pairs nucleic acid single linear cDNA NO NO 56 TTGCCCCACA GGGCAGT 17
17 base pairs nucleic acid single linear cDNA NO NO 57 TGCCCCACAG GGCAGTG 17
17 base pairs nucleic acid single linear cDNA NO NO 58 GCCCCACAGG GCAGTGA 17
17 base pairs nucleic acid single linear cDNA NO NO 59 CCCCACAGGG CAGTGAC 17
17 base pairs nucleic acid single linear cDNA NO NO 60 CCCACAGGGC AGTGACC 17
17 base pairs nucleic acid single linear cDNA NO NO 61 CCACAGGGCA GTGACCG 17
17 base pairs nucleic acid single linear cDNA NO NO 62 CACAGGGCAG TGACCGC 17
17 base pairs nucleic acid single linear cDNA NO NO 63 ACAGGGCAGT GACCGCA 17
17 base pairs nucleic acid single linear cDNA NO NO 64 CAGGGCAGTG ACCGCAG 17
17 base pairs nucleic acid single linear cDNA NO NO 65 AGGGCAGTGA CCGCAGA 17
17 base pairs nucleic acid single linear cDNA NO NO 66 GGGCAGTGAC CGCAGAC 17
17 base pairs nucleic acid single linear
cDNA NO NO 67 GGCAGTGACC GCAGACT 17
17 base pairs nucleic acid single linear cDNA NO NO 68 GCAGTGACCG CAGACTT 17
17 base pairs nucleic acid single linear cDNA NO NO 69 CAGTGACCGC AGACTTC 17
17 base pairs nucleic acid single linear cDNA NO NO 70 AGTGACCGCA GACTTCT 17
17 base pairs nucleic acid single linear cDNA NO NO 71 GTGACCGCAG ACTTCTC 17
17 base pairs nucleic acid single linear cDNA NO NO 72 TGACCGCAGA CTTCTCC 17
17 base pairs nucleic acid single linear cDNA NO NO 73 GACCGCAGAC TTCTCCT 17
17 base pairs nucleic acid single linear cDNA NO NO 74 ACCGCAGACT TCTCCTC 17
17 base pairs nucleic acid single linear cDNA NO NO 75 CCGCAGACTT CTCCTCA 17
17 base pairs nucleic acid single linear cDNA NO NO 76 CGCAGACTTC TCCTCAC 17
17 base pairs nucleic acid single linear cDNA NO NO 77 GCAGACTTCT CCTCACT 17
17 base pairs nucleic acid single linear cDNA NO NO 78 CAGACTTCTC CTCACTG 17
17 base pairs nucleic acid single linear cDNA NO NO 79 AGACTTCTCC TCACTGG 17
17 base pairs nucleic acid single linear cDNA NO NO 80 GACTTCTCCT CACTGGA 17
17 base pairs nucleic acid single linear cDNA NO NO 81 ACTTCTCCTC ACTGGAC 17
17 base pairs nucleic acid single linear cDNA NO NO 82 CTTCTCCTCA CTGGACA 17
17 base pairs nucleic acid single linear cDNA NO NO 83 TTCTCCTCAC TGGACAG 17
17 base pairs nucleic acid single linear cDNA NO NO 84 TCTCCTCACT GGACAGA 17
17 base pairs nucleic acid single linear cDNA NO NO 85 CTCCTCACTG GACAGAT 17
17 base pairs nucleic acid single linear cDNA NO NO 86 TCCTCACTGG ACAGATG 17
17 base pairs nucleic acid single linear cDNA NO NO 87 CCTCACTGGA CAGATGC 17
17 base pairs nucleic acid single linear cDNA NO NO 88 CTCACTGGAC AGATGCA 17
17 base pairs nucleic acid single linear cDNA NO NO 89 TCACTGGACA GATGCAC 17
17 base pairs nucleic acid single linear cDNA NO NO 90 CACTGGACAG ATGCACC 17
17 base pairs nucleic acid single linear cDNA NO NO 91 ACTGGACAGA TGCACCA 17
17 base pairs nucleic acid single linear cDNA NO NO 92 CTGGACAGAT GCACCAT 17
17 base pairs nucleic acid single linear cDNA NO NO 93 TGGACAGATG
CACCATT 17
17 base pairs nucleic acid single linear cDNA NO NO 94 GGACAGATGC ACCATTC 17
17 base pairs nucleic acid single linear cDNA NO NO 95 GACAGATGCA CCATTCT 17
17 base pairs nucleic acid single linear cDNA NO NO 96 ACAGATGCAC CATTCTG 17
17 base pairs nucleic acid single linear cDNA NO NO 97 CAGATGCACC ATTCTGT 17
17 base pairs nucleic acid single linear cDNA NO NO 98 AGATGCACCA TTCTGTC 17
17 base pairs nucleic acid single linear cDNA NO NO 99 GATGCACCAT TCTGTCT 17
17 base pairs nucleic acid single linear cDNA NO NO 100 ATGCACCATT CTGTCTG 17
17 base pairs nucleic acid single linear cDNA NO NO 101 TGCACCATTC TGTCTGT 17
17 base pairs nucleic acid single linear cDNA NO NO 102 GCACCATTCT GTCTGTT 17
17 base pairs nucleic acid single linear cDNA NO NO 103 CACCATTCTG TCTGTTT 17
17 base pairs nucleic acid single linear cDNA NO NO 104 ACCATTCTGT CTGTTTT 17
17 base pairs nucleic acid single linear cDNA NO NO 105 CCATTCTGTC TGTTTTG 17
17 base pairs nucleic acid single linear cDNA NO NO 106 CATTCTGTCT GTTTTGG 17
17 base pairs nucleic acid single linear cDNA NO NO 107 ATTCTGTCTG TTTTGGG 17
17 base pairs nucleic acid single linear cDNA NO NO 108 TTCTGTCTGT TTTGGGG 17
17 base pairs nucleic acid single linear cDNA NO NO 109 TCTGTCTGTT TTGGGGG 17
17 base pairs nucleic acid single linear cDNA NO NO 110 CTGTCTGTTT TGGGGGA 17
17 base pairs nucleic acid single linear cDNA NO NO 111 TGTCTGTTTT GGGGGAT 17
17 base pairs nucleic acid single linear cDNA NO NO 112 GTCTGTTTTG GGGGATT 17
17 base pairs nucleic acid single linear cDNA NO NO 113 TCTGTTTTGG GGGATTG 17
17 base pairs nucleic acid single linear cDNA NO NO 114 CTGTTTTGGG GGATTGC 17
17 base pairs nucleic acid single linear cDNA NO NO 115 TGTTTTGGGG GATTGCA 17
17 base pairs nucleic acid single linear cDNA NO NO 116 GTTTTGGGGG ATTGCAA 17
17 base pairs nucleic acid single linear cDNA NO NO 117 TTTTGGGGGA TTGCAAG 17
17 base pairs nucleic acid single linear cDNA NO NO 118 TTTGGGGGAT TGCAAGT 17
17 base pairs nucleic acid single linear cDNA NO NO 119 TTGGGGGATT
GCAAGTA 17
17 base pairs nucleic acid single linear cDNA NO NO 120 TGGGGGATTG CAAGTAA 17
17 base pairs nucleic acid single linear cDNA NO NO 121 GGGGGATTGC AAGTAAA 17
17 base pairs nucleic acid single linear cDNA NO NO 122 GGGGATTGCA AGTAAAC 17
17 base pairs nucleic acid single linear cDNA NO NO 123 GGGATTGCAA GTAAACA 17
17 base pairs nucleic acid single linear cDNA NO NO 124 GGATTGCAAG TAAACAC 17
17 base pairs nucleic acid single linear cDNA NO NO 125 GATTGCAAGT AAACACA 17
17 base pairs nucleic acid single linear cDNA NO NO 126 ATTGCAAGTA AACACAG 17
17 base pairs nucleic acid single linear cDNA NO NO 127 TTGCAAGTAA ACACAGT 17
17 base pairs nucleic acid single linear cDNA NO NO 128 TGCAAGTAAA CACAGTT 17
17 base pairs nucleic acid single linear cDNA NO NO 129 GCAAGTAAAC ACAGTTG 17
17 base pairs nucleic acid single linear cDNA NO NO 130 CAAGTAAACA CAGTTGT 17
17 base pairs nucleic acid single linear cDNA NO NO 131 AAGTAAACAC AGTTGTG 17
17 base pairs nucleic acid single linear cDNA NO NO 132 AGTAAACACA GTTGTGT 17
17 base pairs nucleic acid single linear cDNA NO NO 133 GTAAACACAG TTGTGTC 17
17 base pairs nucleic acid single linear cDNA NO NO 134 TAAACACAGT TGTGTCA 17
17 base pairs nucleic acid single linear cDNA NO NO 135 AAACACAGTT GTGTCAA 17
17 base pairs nucleic acid single linear cDNA NO NO 136 AACACAGTTG TGTCAAA 17
17 base pairs nucleic acid single linear cDNA NO NO 137 ACACAGTTGT GTCAAAA 17
17 base pairs nucleic acid single linear cDNA NO NO 138 CACAGTTGTG TCAAAAG 17
17 base pairs nucleic acid single linear cDNA NO NO 139 ACAGTTGTGT CAAAAGC 17
17 base pairs nucleic acid single linear cDNA NO NO 140 CAGTTGTGTC AAAAGCA 17
17 base pairs nucleic acid single linear cDNA NO NO 141 AGTTGTGTCA AAAGCAA 17
17 base pairs nucleic acid single linear cDNA NO NO 142 GTTGTGTCAA AAGCAAG 17
17 base pairs nucleic acid single linear cDNA NO NO 143 TTGTGTCAAA AGCAAGT 17
17 base pairs nucleic acid single linear cDNA NO NO 144 TGTGTCAAAA GCAAGTG 17
20 base pairs nucleic acid single linear cDNA NO NO 145 GTACTGTCCA
TTTATCAGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 146 TACTGTCCAT TTATCAGGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 147 ACTGTCCATT TATCAGGATG 20
20 base pairs nucleic acid single linear cDNA NO NO 148 CTGTCCATTT ATCAGGATGG 20
20 base pairs nucleic acid single linear cDNA NO NO 149 TGTCCATTTA TCAGGATGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 150 GTCCATTTAT CAGGATGGAG 20
20 base pairs nucleic acid single linear cDNA NO NO 151 TCCATTTATC AGGATGGAGT 20
20 base pairs nucleic acid single linear cDNA NO NO 152 CCATTTATCA GGATGGAGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 153 CATTTATCAG GATGGAGTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 154 ATTTATCAGG ATGGAGTTCA 20
20 base pairs nucleic acid single linear cDNA NO NO 155 TTTATCAGGA TGGAGTTCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 156 TTATCAGGAT GGAGTTCATA 20
20 base pairs nucleic acid single linear cDNA NO NO 157 TATCAGGATG GAGTTCATAA 20
20 base pairs nucleic acid single linear cDNA NO NO 158 ATCAGGATGG AGTTCATAAC 20
20 base pairs nucleic acid single linear cDNA NO NO 159 TCAGGATGGA GTTCATAACC 20
20 base pairs nucleic acid single linear cDNA NO NO 160 CAGGATGGAG TTCATAACCC 20
20 base pairs nucleic acid single linear cDNA NO NO 161 AGGATGGAGT TCATAACCCA 20
20 base pairs nucleic acid single linear cDNA NO NO 162 GGATGGAGTT CATAACCCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 163 GATGGAGTTC ATAACCCATC 20
20 base pairs nucleic acid single linear cDNA NO NO 164 ATGGAGTTCA TAACCCATCC 20
20 base pairs nucleic acid single linear cDNA NO NO 165 TGGAGTTCAT AACCCATCCC 20
20 base pairs nucleic acid single linear cDNA NO NO 166 GGAGTTCATA ACCCATCCCA 20
20 base pairs nucleic acid single linear cDNA NO NO 167 GAGTTCATAA CCCATCCCAA 20
20 base pairs nucleic acid single linear cDNA NO NO 168 AGTTCATAAC CCATCCCAAA 20
20 base pairs nucleic acid single linear cDNA NO NO 169 GTTCATAACC CATCCCAAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 170 TTCATAACCC
ATCCCAAAGG 20
20 base pairs nucleic acid single linear cDNA NO NO 171 TCATAACCCA TCCCAAAGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 172 CATAACCCAT CCCAAAGGAA 20
20 base pairs nucleic acid single linear cDNA NO NO 173 ATAACCCATC CCAAAGGAAT 20
20 base pairs nucleic acid single linear cDNA NO NO 174 TAACCCATCC CAAAGGAATG 20
20 base pairs nucleic acid single linear cDNA NO NO 175 AACCCATCCC AAAGGAATGG 20
20 base pairs nucleic acid single linear cDNA NO NO 176 ACCCATCCCA AAGGAATGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 177 CCCATCCCAA AGGAATGGAG 20
20 base pairs nucleic acid single linear cDNA NO NO 178 CCATCCCAAA GGAATGGAGG 20
20 base pairs nucleic acid single linear cDNA NO NO 179 CATCCCAAAG GAATGGAGGT 20
20 base pairs nucleic acid single linear cDNA NO NO 180 ATCCCAAAGG AATGGAGGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 181 TCCCAAAGGA ATGGAGGTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 182 CCCAAAGGAA TGGAGGTTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 183 CCAAAGGAAT GGAGGTTCTT 20
20 base pairs nucleic acid single linear cDNA NO NO 184 CAAAGGAATG GAGGTTCTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 185 AAAGGAATGG AGGTTCTTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 186 AAGGAATGGA GGTTCTTTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 187 AGGAATGGAG GTTCTTTCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 188 GGAATGGAGG TTCTTTCTGA 20
20 base pairs nucleic acid single linear cDNA NO NO 189 GAATGGAGGT TCTTTCTGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 190 AATGGAGGTT CTTTCTGATG 20
20 base pairs nucleic acid single linear cDNA NO NO 191 ATGGAGGTTC TTTCTGATGT 20
20 base pairs nucleic acid single linear cDNA NO NO 192 TGGAGGTTCT TTCTGATGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 193 GGAGGTTCTT TCTGATGTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 194 GAGGTTCTTT CTGATGTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 195
AGGTTCTTTC TGATGTTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 196 GGTTCTTTCT GATGTTTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 197 GTTCTTTCTG ATGTTTTTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 198 TTCTTTCTGA TGTTTTTTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 199 TCTTTCTGAT GTTTTTTGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 200 CTTTCTGATG TTTTTTGTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 201 TTTCTGATGT TTTTTGTCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 202 TTCTGATGTT TTTTGTCTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 203 TCTGATGTTT TTTGTCTGGT 20
20 base pairs nucleic acid single linear cDNA NO NO 204 CTGATGTTTT TTGTCTGGTG 20
20 base pairs nucleic acid single linear cDNA NO NO 205 TGATGTTTTT TGTCTGGTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 206 GATGTTTTTT GTCTGGTGTG 20
20 base pairs nucleic acid single linear cDNA NO NO 207 ATGTTTTTTG TCTGGTGTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 208 TGTTTTTTGT CTGGTGTGGT 20
20 base pairs nucleic acid single linear cDNA NO NO 209 GTTTTTTGTC TGGTGTGGTA 20
20 base pairs nucleic acid single linear cDNA NO NO 210 TTTTTTGTCT GGTGTGGTAA 20
20 base pairs nucleic acid single linear cDNA NO NO 211 TTTTTGTCTG GTGTGGTAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 212 TTTTGTCTGG TGTGGTAAGT 20
20 base pairs nucleic acid single linear cDNA NO NO 213 TTTGTCTGGT GTGGTAAGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 214 TTGTCTGGTG TGGTAAGTCC 20
20 base pairs nucleic acid single linear cDNA NO NO 215 TGTCTGGTGT GGTAAGTCCC 20
20 base pairs nucleic acid single linear cDNA NO NO 216 GTCTGGTGTG GTAAGTCCCC 20
20 base pairs nucleic acid single linear cDNA NO NO 217 TCTGGTGTGG TAAGTCCCCA 20
20 base pairs nucleic acid single linear cDNA NO NO 218 CTGGTGTGGT AAGTCCCCAC 20
20 base pairs nucleic acid single linear cDNA NO NO 219 TGGTGTGGTA AGTCCCCACC 20
20 base pairs nucleic acid single linear cDNA NO NO
220 GGTGTGGTAA GTCCCCACCT 20
20 base pairs nucleic acid single linear cDNA NO NO 221 GTGTGGTAAG TCCCCACCTC 20
20 base pairs nucleic acid single linear cDNA NO NO 222 TGTGGTAAGT CCCCACCTCA 20
20 base pairs nucleic acid single linear cDNA NO NO 223 GTGGTAAGTC CCCACCTCAA 20
20 base pairs nucleic acid single linear cDNA NO NO 224 TGGTAAGTCC CCACCTCAAC 20
20 base pairs nucleic acid single linear cDNA NO NO 225 GGTAAGTCCC CACCTCAACA 20
20 base pairs nucleic acid single linear cDNA NO NO 226 GTAAGTCCCC ACCTCAACAG 20
20 base pairs nucleic acid single linear cDNA NO NO 227 TAAGTCCCCA CCTCAACAGA 20
20 base pairs nucleic acid single linear cDNA NO NO 228 AAGTCCCCAC CTCAACAGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 229 AGTCCCCACC TCAACAGATG 20
20 base pairs nucleic acid single linear cDNA NO NO 230 GTCCCCACCT CAACAGATGT 20
20 base pairs nucleic acid single linear cDNA NO NO 231 TCCCCACCTC AACAGATGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 232 CCCCACCTCA ACAGATGTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 233 CCCACCTCAA CAGATGTTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 234 CCACCTCAAC AGATGTTGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 235 CACCTCAACA GATGTTGTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 236 ACCTCAACAG ATGTTGTCTC 20
20 base pairs nucleic acid single linear cDNA NO NO 237 CCTCAACAGA TGTTGTCTCA 20
20 base pairs nucleic acid single linear cDNA NO NO 238 CTCAACAGAT GTTGTCTCAG 20
20 base pairs nucleic acid single linear cDNA NO NO 239 TCAACAGATG TTGTCTCAGC 20
20 base pairs nucleic acid single linear cDNA NO NO 240 CAACAGATGT TGTCTCAGCT 20
20 base pairs nucleic acid single linear cDNA NO NO 241 AACAGATGTT GTCTCAGCTC 20
20 base pairs nucleic acid single linear cDNA NO NO 242 ACAGATGTTG TCTCAGCTCC 20
20 base pairs nucleic acid single linear cDNA NO NO 243 CAGATGTTGT CTCAGCTCCT 20
20 base pairs nucleic acid single linear cDNA NO NO 244 AGATGTTGTC TCAGCTCCTC 20
20 base pairs nucleic acid single linear cDNA NO NO 245
GATGTTGTCT CAGCTCCTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 246 ATGTTGTCTC AGCTCCTCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 247 TGTTGTCTCA GCTCCTCTAT 20
20 base pairs nucleic acid single linear cDNA NO NO 248 GTTGTCTCAG CTCCTCTATT 20
20 base pairs nucleic acid single linear cDNA NO NO 249 TTGTCTCAGC TCCTCTATTT 20
20 base pairs nucleic acid single linear cDNA NO NO 250 TGTCTCAGCT CCTCTATTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 251 GTCTCAGCTC CTCTATTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 252 TCTCAGCTCC TCTATTTTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 253 CTCAGCTCCT CTATTTTTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 254 TCAGCTCCTC TATTTTTGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 255 CAGCTCCTCT ATTTTTGTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 256 AGCTCCTCTA TTTTTGTTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 257 GCTCCTCTAT TTTTGTTCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 258 CTCCTCTATT TTTGTTCTAT 20
20 base pairs nucleic acid single linear cDNA NO NO 259 TCCTCTATTT TTGTTCTATG 20
20 base pairs nucleic acid single linear cDNA NO NO 260 CCTCTATTTT TGTTCTATGC 20
20 base pairs nucleic acid single linear cDNA NO NO 261 CTCTATTTTT GTTCTATGCT 20
20 base pairs nucleic acid single linear cDNA NO NO 262 TCTATTTTTG TTCTATGCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 263 CTATTTTTGT TCTATGCTGC 20
20 base pairs nucleic acid single linear cDNA NO NO 264 TATTTTTGTT CTATGCTGCC 20
20 base pairs nucleic acid single linear cDNA NO NO 265 ATTTTTGTTC TATGCTGCCC 20
20 base pairs nucleic acid single linear cDNA NO NO 266 TTTTTGTTCT ATGCTGCCCT 20
20 base pairs nucleic acid single linear cDNA NO NO 267 TTTTGTTCTA TGCTGCCCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 268 TTTGTTCTAT GCTGCCCTAT 20
20 base pairs nucleic acid single linear cDNA NO NO 269 TTGTTCTATG CTGCCCTATT 20
20 base pairs nucleic acid single linear cDNA NO NO 270
TGTTCTATGC TGCCCTATTT 20
20 base pairs nucleic acid single linear cDNA NO NO 271 GTTCTATGCT GCCCTATTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 272 TTCTATGCTG CCCTATTTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 273 TCTATGCTGC CCTATTTCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 274 CTATGCTGCC CTATTTCTAA 20
20 base pairs nucleic acid single linear cDNA NO NO 275 TATGCTGCCC TATTTCTAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 276 ATGCTGCCCT ATTTCTAAGT 20
20 base pairs nucleic acid single linear cDNA NO NO 277 TGCTGCCCTA TTTCTAAGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 278 GCTGCCCTAT TTCTAAGTCA 20
20 base pairs nucleic acid single linear cDNA NO NO 279 CTGCCCTATT TCTAAGTCAG 20
20 base pairs nucleic acid single linear cDNA NO NO 280 TGCCCTATTT CTAAGTCAGA 20
20 base pairs nucleic acid single linear cDNA NO NO 281 GCCCTATTTC TAAGTCAGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 282 CCCTATTTCT AAGTCAGATC 20
20 base pairs nucleic acid single linear cDNA NO NO 283 CCTATTTCTA AGTCAGATCC 20
20 base pairs nucleic acid single linear cDNA NO NO 284 CTATTTCTAA GTCAGATCCT 20
20 base pairs nucleic acid single linear cDNA NO NO 285 TATTTCTAAG TCAGATCCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 286 ATTTCTAAGT CAGATCCTAC 20
20 base pairs nucleic acid single linear cDNA NO NO 287 TTTCTAAGTC AGATCCTACA 20
20 base pairs nucleic acid single linear cDNA NO NO 288 TTCTAAGTCA GATCCTACAT 20
20 base pairs nucleic acid single linear cDNA NO NO 289 TCTAAGTCAG ATCCTACATA 20
20 base pairs nucleic acid single linear cDNA NO NO 290 CTAAGTCAGA TCCTACATAC 20
20 base pairs nucleic acid single linear cDNA NO NO 291 TAAGTCAGAT CCTACATACA 20
20 base pairs nucleic acid single linear cDNA NO NO 292 AAGTCAGATC CTACATACAA 20
20 base pairs nucleic acid single linear cDNA NO NO 293 AGTCAGATCC TACATACAAA 20
20 base pairs nucleic acid single linear cDNA NO NO 294 GTCAGATCCT ACATACAAAT 20
20 base pairs nucleic acid single linear cDNA NO NO 295
TCAGATCCTA CATACAAATC 20
20 base pairs nucleic acid single linear cDNA NO NO 296 CAGATCCTAC ATACAAATCA 20
20 base pairs nucleic acid single linear cDNA NO NO 297 AGATCCTACA TACAAATCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 298 GATCCTACAT ACAAATCATC 20
20 base pairs nucleic acid single linear cDNA NO NO 299 ATCCTACATA CAAATCATCC 20
20 base pairs nucleic acid single linear cDNA NO NO 300 TCCTACATAC AAATCATCCA 20
20 base pairs nucleic acid single linear cDNA NO NO 301 CCTACATACA AATCATCCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 302 CTACATACAA ATCATCCATG 20
20 base pairs nucleic acid single linear cDNA NO NO 303 TACATACAAA TCATCCATGT 20
20 base pairs nucleic acid single linear cDNA NO NO 304 ACATACAAAT CATCCATGTA 20
20 base pairs nucleic acid single linear cDNA NO NO 305 CATACAAATC ATCCATGTAT 20
20 base pairs nucleic acid single linear cDNA NO NO 306 ATACAAATCA TCCATGTATT 20
20 base pairs nucleic acid single linear cDNA NO NO 307 TACAAATCAT CCATGTATTG 20
20 base pairs nucleic acid single linear cDNA NO NO 308 ACAAATCATC CATGTATTGA 20
20 base pairs nucleic acid single linear cDNA NO NO 309 CAAATCATCC ATGTATTGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 310 AAATCATCCA TGTATTGATA 20
20 base pairs nucleic acid single linear cDNA NO NO 311 AATCATCCAT GTATTGATAG 20
20 base pairs nucleic acid single linear cDNA NO NO 312 ATCATCCATG TATTGATAGA 20
20 base pairs nucleic acid single linear cDNA NO NO 313 TCATCCATGT ATTGATAGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 314 CATCCATGTA TTGATAGATA 20
20 base pairs nucleic acid single linear cDNA NO NO 315 ATCCATGTAT TGATAGATAA 20
20 base pairs nucleic acid single linear cDNA NO NO 316 TCCATGTATT GATAGATAAC 20
20 base pairs nucleic acid single linear cDNA NO NO 317 CCATGTATTG ATAGATAACT 20
20 base pairs nucleic acid single linear cDNA NO NO 318 CATGTATTGA TAGATAACTA 20
20 base pairs nucleic acid single linear cDNA NO NO 319 ATGTATTGAT AGATAACTAT 20
20 base pairs nucleic acid single linear cDNA NO NO 320
TGTATTGATA GATAACTATG 20
20 base pairs nucleic acid single linear cDNA NO NO 321 GTATTGATAG ATAACTATGT 20
20 base pairs nucleic acid single linear cDNA NO NO 322 TATTGATAGA TAACTATGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 323 ATTGATAGAT AACTATGTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 324 TTGATAGATA ACTATGTCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 325 TGATAGATAA CTATGTCTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 326 GATAGATAAC TATGTCTGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 327 ATAGATAACT ATGTCTGGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 328 TAGATAACTA TGTCTGGATT 20
20 base pairs nucleic acid single linear cDNA NO NO 329 AGATAACTAT GTCTGGATTT 20
20 base pairs nucleic acid single linear cDNA NO NO 330 GATAACTATG TCTGGATTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 331 ATAACTATGT CTGGATTTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 332 TAACTATGTC TGGATTTTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 333 AACTATGTCT GGATTTTGTT 20
20 base pairs nucleic acid single linear cDNA NO NO 334 ACTATGTCTG GATTTTGTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 335 CTATGTCTGG ATTTTGTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 336 TATGTCTGGA TTTTGTTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 337 ATGTCTGGAT TTTGTTTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 338 TGTCTGGATT TTGTTTTTTA 20
20 base pairs nucleic acid single linear cDNA NO NO 339 GTCTGGATTT TGTTTTTTAA 20
20 base pairs nucleic acid single linear cDNA NO NO 340 TCTGGATTTT GTTTTTTAAA 20
20 base pairs nucleic acid single linear cDNA NO NO 341 CTGGATTTTG TTTTTTAAAA 20
20 base pairs nucleic acid single linear cDNA NO NO 342 TGGATTTTGT TTTTTAAAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 343 GGATTTTGTT TTTTAAAAGG 20
20 base pairs nucleic acid single linear cDNA NO NO 344 GATTTTGTTT TTTAAAAGGC 20
20 base pairs nucleic acid single linear cDNA NO NO 345
ATTTTGTTTT TTAAAAGGCT 20
20 base pairs nucleic acid single linear cDNA NO NO 346 TTTTGTTTTT TAAAAGGCTC 20
20 base pairs nucleic acid single linear cDNA NO NO 347 TTTGTTTTTT AAAAGGCTCT 20
20 base pairs nucleic acid single linear cDNA NO NO 348 TTGTTTTTTA AAAGGCTCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 349 TGTTTTTTAA AAGGCTCTAA 20
20 base pairs nucleic acid single linear cDNA NO NO 350 GTTTTTTAAA AGGCTCTAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 351 TTTTTTAAAA GGCTCTAAGA 20
20 base pairs nucleic acid single linear cDNA NO NO 352 TTTTTAAAAG GCTCTAAGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 353 TTTTAAAAGG CTCTAAGATT 20
20 base pairs nucleic acid single linear cDNA NO NO 354 TTTAAAAGGC TCTAAGATTT 20
20 base pairs nucleic acid single linear cDNA NO NO 355 TTAAAAGGCT CTAAGATTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 356 TAAAAGGCTC TAAGATTTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 357 AAAAGGCTCT AAGATTTTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 358 AAAGGCTCTA AGATTTTTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 359 AAGGCTCTAA GATTTTTGTC 20
20 base pairs nucleic acid single linear cDNA NO NO 360 AGGCTCTAAG ATTTTTGTCA 20
20 base pairs nucleic acid single linear cDNA NO NO 361 GGCTCTAAGA TTTTTGTCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 362 GCTCTAAGAT TTTTGTCATG 20
20 base pairs nucleic acid single linear cDNA NO NO 363 CTCTAAGATT TTTGTCATGC 20
20 base pairs nucleic acid single linear cDNA NO NO 364 TCTAAGATTT TTGTCATGCT 20
20 base pairs nucleic acid single linear cDNA NO NO 365 CTAAGATTTT TGTCATGCTA 20
20 base pairs nucleic acid single linear cDNA NO NO 366 TAAGATTTTT GTCATGCTAC 20
20 base pairs nucleic acid single linear cDNA NO NO 367 AAGATTTTTG TCATGCTACT 20
20 base pairs nucleic acid single linear cDNA NO NO 368 AGATTTTTGT CATGCTACTT 20
20 base pairs nucleic acid single linear cDNA NO NO 369 GATTTTTGTC ATGCTACTTT 20
20 base pairs nucleic acid single linear cDNA NO NO
370 ATTTTTGTCA TGCTACTTTG 20
20 base pairs nucleic acid single linear cDNA NO NO 371 TTTTTGTCAT GCTACTTTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 372 TTTTGTCATG CTACTTTGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 373 TTTGTCATGC TACTTTGGAA 20
20 base pairs nucleic acid single linear cDNA NO NO 374 TTGTCATGCT ACTTTGGAAT 20
20 base pairs nucleic acid single linear cDNA NO NO 375 TGTCATGCTA CTTTGGAATA 20
20 base pairs nucleic acid single linear cDNA NO NO 376 GTCATGCTAC TTTGGAATAT 20
20 base pairs nucleic acid single linear cDNA NO NO 377 TCATGCTACT TTGGAATATT 20
20 base pairs nucleic acid single linear cDNA NO NO 378 CATGCTACTT TGGAATATTG 20
20 base pairs nucleic acid single linear cDNA NO NO 379 ATGCTACTTT GGAATATTGC 20
20 base pairs nucleic acid single linear cDNA NO NO 380 TGCTACTTTG GAATATTGCT 20
20 base pairs nucleic acid single linear cDNA NO NO 381 GCTACTTTGG AATATTGCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 382 CTACTTTGGA ATATTGCTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 383 TACTTTGGAA TATTGCTGGT 20
20 base pairs nucleic acid single linear cDNA NO NO 384 ACTTTGGAAT ATTGCTGGTG 20
20 base pairs nucleic acid single linear cDNA NO NO 385 CTTTGGAATA TTGCTGGTGA 20
20 base pairs nucleic acid single linear cDNA NO NO 386 TTTGGAATAT TGCTGGTGAT 20
20 base pairs nucleic acid single linear cDNA NO NO 387 TTGGAATATT GCTGGTGATC 20
20 base pairs nucleic acid single linear cDNA NO NO 388 TGGAATATTG CTGGTGATCC 20
20 base pairs nucleic acid single linear cDNA NO NO 389 GGAATATTGC TGGTGATCCT 20
20 base pairs nucleic acid single linear cDNA NO NO 390 GAATATTGCT GGTGATCCTT 20
20 base pairs nucleic acid single linear cDNA NO NO 391 AATATTGCTG GTGATCCTTT 20
20 base pairs nucleic acid single linear cDNA NO NO 392 ATATTGCTGG TGATCCTTTC 20
20 base pairs nucleic acid single linear cDNA NO NO 393 TATTGCTGGT GATCCTTTCC 20
20 base pairs nucleic acid single linear cDNA NO NO 394 ATTGCTGGTG ATCCTTTCCA 20
20 base pairs nucleic acid single linear cDNA NO NO 395
TTGCTGGTGA TCCTTTCCAT 20
20 base pairs nucleic acid single linear cDNA NO NO 396 TGCTGGTGAT CCTTTCCATC 20
20 base pairs nucleic acid single linear cDNA NO NO 397 GCTGGTGATC CTTTCCATCC 20
20 base pairs nucleic acid single linear cDNA NO NO 398 CTGGTGATCC TTTCCATCCC 20
20 base pairs nucleic acid single linear cDNA NO NO 399 TGGTGATCCT TTCCATCCCT 20
20 base pairs nucleic acid single linear cDNA NO NO 400 GGTGATCCTT TCCATCCCTG 20
20 base pairs nucleic acid single linear cDNA NO NO 401 GTGATCCTTT CCATCCCTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 402 TGATCCTTTC CATCCCTGTG 20
20 base pairs nucleic acid single linear cDNA NO NO 403 GATCCTTTCC ATCCCTGTGG 20
20 base pairs nucleic acid single linear cDNA NO NO 404 ATCCTTTCCA TCCCTGTGGA 20
20 base pairs nucleic acid single linear cDNA NO NO 405 TCCTTTCCAT CCCTGTGGAA 20
20 base pairs nucleic acid single linear cDNA NO NO 406 CCTTTCCATC CCTGTGGAAG 20
20 base pairs nucleic acid single linear cDNA NO NO 407 CTTTCCATCC CTGTGGAAGC 20
20 base pairs nucleic acid single linear cDNA NO NO 408 TTTCCATCCC TGTGGAAGCA 20
20 base pairs nucleic acid single linear cDNA NO NO 409 TTCCATCCCT GTGGAAGCAC 20
20 base pairs nucleic acid single linear cDNA NO NO 410 TCCATCCCTG TGGAAGCACA 20
20 base pairs nucleic acid single linear cDNA NO NO 411 CCATCCCTGT GGAAGCACAT 20
20 base pairs nucleic acid single linear cDNA NO NO 412 CATCCCTGTG GAAGCACATT 20
20 base pairs nucleic acid single linear cDNA NO NO 413 ATCCCTGTGG AAGCACATTG 20
20 base pairs nucleic acid single linear cDNA NO NO 414 TCCCTGTGGA AGCACATTGT 20
20 base pairs nucleic acid single linear cDNA NO NO 415 CCCTGTGGAA GCACATTGTA 20
20 base pairs nucleic acid single linear cDNA NO NO 416 CCTGTGGAAG CACATTGTAC 20
20 base pairs nucleic acid single linear cDNA NO NO 417 CTGTGGAAGC ACATTGTACT 20
20 base pairs nucleic acid single linear cDNA NO NO 418 TGTGGAAGCA CATTGTACTG 20
20 base pairs nucleic acid single linear cDNA NO NO 419 GTGGAAGCAC ATTGTACTGA 20
20 base pairs nucleic acid single linear cDNA NO NO 420
TGGAAGCACA TTGTACTGAT 20 20 base pairs nucleicacid single linear cDNA NO NO 421 GGAAGCACAT TGTACTGATA 20 20 base pairsnucleic acid single linear cDNA NO NO 422 GAAGCACATT GTACTGATAT 20 20base pairs nucleic acid single linear cDNA NO NO 423 AAGCACATTGTACTGATATC 20 20 base pairs nucleic acid single linear cDNA NO NO 424AGCACATTGT ACTGATATCT 20 20 base pairs nucleic acid single linear cDNANO NO 425 GCACATTGTA CTGATATCTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 426 CACATTGTAC TGATATCTAA 20 20 base pairs nucleicacid single linear cDNA NO NO 427 ACATTGTACT GATATCTAAT 20 20 base pairsnucleic acid single linear cDNA NO NO 428 CATTGTACTG ATATCTAATC 20 20base pairs nucleic acid single linear cDNA NO NO 429 ATTGTACTGATATCTAATCC 20 20 base pairs nucleic acid single linear cDNA NO NO 430TTGTACTGAT ATCTAATCCC 20 20 base pairs nucleic acid single linear cDNANO NO 431 TGTACTGATA TCTAATCCCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 432 GTACTGATAT CTAATCCCTG 20 20 base pairs nucleicacid single linear cDNA NO NO 433 TACTGATATC TAATCCCTGG 20 20 base pairsnucleic acid single linear cDNA NO NO 434 ACTGATATCT AATCCCTGGT 20 20base pairs nucleic acid single linear cDNA NO NO 435 CTGATATCTAATCCCTGGTG 20 20 base pairs nucleic acid single linear cDNA NO NO 436TGATATCTAA TCCCTGGTGT 20 20 base pairs nucleic acid single linear cDNANO NO 437 GATATCTAAT CCCTGGTGTC 20 20 base pairs nucleic acid singlelinear cDNA NO NO 438 ATATCTAATC CCTGGTGTCT 20 20 base pairs nucleicacid single linear cDNA NO NO 439 TATCTAATCC CTGGTGTCTC 20 20 base pairsnucleic acid single linear cDNA NO NO 440 ATCTAATCCC TGGTGTCTCA 20 20base pairs nucleic acid single linear cDNA NO NO 441 TCTAATCCCTGGTGTCTCAT 20 20 base pairs nucleic acid single linear cDNA NO NO 442CTAATCCCTG GTGTCTCATT 20 20 base pairs nucleic acid single linear cDNANO NO 443 TAATCCCTGG TGTCTCATTG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 444 AATCCCTGGT GTCTCATTGT 20 20 base pairs nucleicacid single linear cDNA NO NO 445 
ATCCCTGGTG TCTCATTGTT 20 20 base pairsnucleic acid single linear cDNA NO NO 446 TCCCTGGTGT CTCATTGTTT 20 20base pairs nucleic acid single linear cDNA NO NO 447 CCCTGGTGTCTCATTGTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO 448CCTGGTGTCT CATTGTTTAT 20 20 base pairs nucleic acid single linear cDNANO NO 449 CTGGTGTCTC ATTGTTTATA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 450 TGGTGTCTCA TTGTTTATAC 20 20 base pairs nucleicacid single linear cDNA NO NO 451 GGTGTCTCAT TGTTTATACT 20 20 base pairsnucleic acid single linear cDNA NO NO 452 GTGTCTCATT GTTTATACTA 20 20base pairs nucleic acid single linear cDNA NO NO 453 TGTCTCATTGTTTATACTAG 20 20 base pairs nucleic acid single linear cDNA NO NO 454GTCTCATTGT TTATACTAGG 20 20 base pairs nucleic acid single linear cDNANO NO 455 TCTCATTGTT TATACTAGGT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 456 CTCATTGTTT ATACTAGGTA 20 20 base pairs nucleicacid single linear cDNA NO NO 457 TCATTGTTTA TACTAGGTAT 20 20 base pairsnucleic acid single linear cDNA NO NO 458 CATTGTTTAT ACTAGGTATG 20 20base pairs nucleic acid single linear cDNA NO NO 459 ATTGTTTATACTAGGTATGG 20 20 base pairs nucleic acid single linear cDNA NO NO 460TTGTTTATAC TAGGTATGGT 20 20 base pairs nucleic acid single linear cDNANO NO 461 TGTTTATACT AGGTATGGTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 462 GTTTATACTA GGTATGGTAA 20 20 base pairs nucleicacid single linear cDNA NO NO 463 TTTATACTAG GTATGGTAAA 20 20 base pairsnucleic acid single linear cDNA NO NO 464 TTATACTAGG TATGGTAAAT 20 20base pairs nucleic acid single linear cDNA NO NO 465 TATACTAGGTATGGTAAATG 20 20 base pairs nucleic acid single linear cDNA NO NO 466ATACTAGGTA TGGTAAATGC 20 20 base pairs nucleic acid single linear cDNANO NO 467 TACTAGGTAT GGTAAATGCA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 468 ACTAGGTATG GTAAATGCAG 20 20 base pairs nucleicacid single linear cDNA NO NO 469 CTAGGTATGG TAAATGCAGT 20 20 base pairsnucleic acid single linear cDNA NO NO 470 
TAGGTATGGT AAATGCAGTA 20 20base pairs nucleic acid single linear cDNA NO NO 471 AGGTATGGTAAATGCAGTAT 20 20 base pairs nucleic acid single linear cDNA NO NO 472GGTATGGTAA ATGCAGTATA 20 20 base pairs nucleic acid single linear cDNANO NO 473 GTATGGTAAA TGCAGTATAC 20 20 base pairs nucleic acid singlelinear cDNA NO NO 474 TATGGTAAAT GCAGTATACT 20 20 base pairs nucleicacid single linear cDNA NO NO 475 ATGGTAAATG CAGTATACTT 20 20 base pairsnucleic acid single linear cDNA NO NO 476 TGGTAAATGC AGTATACTTC 20 20base pairs nucleic acid single linear cDNA NO NO 477 GGTAAATGCAGTATACTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO 478GTAAATGCAG TATACTTCCT 20 20 base pairs nucleic acid single linear cDNANO NO 479 TAAATGCAGT ATACTTCCTG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 480 AAATGCAGTA TACTTCCTGA 20 20 base pairs nucleicacid single linear cDNA NO NO 481 AATGCAGTAT ACTTCCTGAA 20 20 base pairsnucleic acid single linear cDNA NO NO 482 ATGCAGTATA CTTCCTGAAG 20 20base pairs nucleic acid single linear cDNA NO NO 483 TGCAGTATACTTCCTGAAGT 20 20 base pairs nucleic acid single linear cDNA NO NO 484GCAGTATACT TCCTGAAGTC 20 20 base pairs nucleic acid single linear cDNANO NO 485 CAGTATACTT CCTGAAGTCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 486 AGTATACTTC CTGAAGTCTT 20 20 base pairs nucleicacid single linear cDNA NO NO 487 GTATACTTCC TGAAGTCTTC 20 20 base pairsnucleic acid single linear cDNA NO NO 488 TATACTTCCT GAAGTCTTCA 20 20base pairs nucleic acid single linear cDNA NO NO 489 ATACTTCCTGAAGTCTTCAT 20 20 base pairs nucleic acid single linear cDNA NO NO 490TACTTCCTGA AGTCTTCATC 20 20 base pairs nucleic acid single linear cDNANO NO 491 ACTTCCTGAA GTCTTCATCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 492 CTTCCTGAAG TCTTCATCTA 20 20 base pairs nucleicacid single linear cDNA NO NO 493 TTCCTGAAGT CTTCATCTAA 20 20 base pairsnucleic acid single linear cDNA NO NO 494 TCCTGAAGTC TTCATCTAAG 20 20base pairs nucleic acid single linear cDNA NO NO 495 
CCTGAAGTCTTCATCTAAGG 20 20 base pairs nucleic acid single linear cDNA NO NO 496CTGAAGTCTT CATCTAAGGG 20 20 base pairs nucleic acid single linear cDNANO NO 497 TGAAGTCTTC ATCTAAGGGA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 498 GAAGTCTTCA TCTAAGGGAA 20 20 base pairs nucleicacid single linear cDNA NO NO 499 AAGTCTTCAT CTAAGGGAAC 20 20 base pairsnucleic acid single linear cDNA NO NO 500 AGTCTTCATC TAAGGGAACT 20 20base pairs nucleic acid single linear cDNA NO NO 501 GTCTTCATCTAAGGGAACTG 20 20 base pairs nucleic acid single linear cDNA NO NO 502TCTTCATCTA AGGGAACTGA 20 20 base pairs nucleic acid single linear cDNANO NO 503 CTTCATCTAA GGGAACTGAA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 504 TTCATCTAAG GGAACTGAAA 20 20 base pairs nucleicacid single linear cDNA NO NO 505 TCATCTAAGG GAACTGAAAA 20 20 base pairsnucleic acid single linear cDNA NO NO 506 CATCTAAGGG AACTGAAAAA 20 20base pairs nucleic acid single linear cDNA NO NO 507 ATCTAAGGGAACTGAAAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO 508TCTAAGGGAA CTGAAAAATA 20 20 base pairs nucleic acid single linear cDNANO NO 509 CTAAGGGAAC TGAAAAATAT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 510 TAAGGGAACT GAAAAATATG 20 20 base pairs nucleicacid single linear cDNA NO NO 511 AAGGGAACTG AAAAATATGC 20 20 base pairsnucleic acid single linear cDNA NO NO 512 AGGGAACTGA AAAATATGCA 20 20base pairs nucleic acid single linear cDNA NO NO 513 GGGAACTGAAAAATATGCAT 20 20 base pairs nucleic acid single linear cDNA NO NO 514GGAACTGAAA AATATGCATC 20 20 base pairs nucleic acid single linear cDNANO NO 515 GAACTGAAAA ATATGCATCA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 516 AACTGAAAAA TATGCATCAC 20 20 base pairs nucleicacid single linear cDNA NO NO 517 ACTGAAAAAT ATGCATCACC 20 20 base pairsnucleic acid single linear cDNA NO NO 518 CTGAAAAATA TGCATCACCC 20 20base pairs nucleic acid single linear cDNA NO NO 519 TGAAAAATATGCATCACCCA 20 20 base pairs nucleic acid single linear cDNA NO NO 
520 GAAAAATATG CATCACCCAC 20 20 base pairs nucleic acid single linear cDNA NO NO
521 AAAAATATGC ATCACCCACA 20 20 base pairs nucleic acid single linear cDNA NO NO
522 AAAATATGCA TCACCCACAT 20 20 base pairs nucleic acid single linear cDNA NO NO
523 AAATATGCAT CACCCACATC 20 20 base pairs nucleic acid single linear cDNA NO NO
524 AATATGCATC ACCCACATCC 20 20 base pairs nucleic acid single linear cDNA NO NO
525 ATATGCATCA CCCACATCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
526 TATGCATCAC CCACATCCAG 20 20 base pairs nucleic acid single linear cDNA NO NO
527 ATGCATCACC CACATCCAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
528 TGCATCACCC ACATCCAGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
529 GCATCACCCA CATCCAGTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
530 CATCACCCAC ATCCAGTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
531 ATCACCCACA TCCAGTACTG 20 20 base pairs nucleic acid single linear cDNA NO NO
532 TCACCCACAT CCAGTACTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
533 CACCCACATC CAGTACTGTT 20 20 base pairs nucleic acid single linear cDNA NO NO
534 ACCCACATCC AGTACTGTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
535 CCCACATCCA GTACTGTTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
536 CCACATCCAG TACTGTTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
537 CACATCCAGT ACTGTTACTG 20 20 base pairs nucleic acid single linear cDNA NO NO
538 ACATCCAGTA CTGTTACTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
539 CATCCAGTAC TGTTACTGAT 20 20 base pairs nucleic acid single linear cDNA NO NO
540 ATCCAGTACT GTTACTGATT 20 20 base pairs nucleic acid single linear cDNA NO NO
541 TCCAGTACTG TTACTGATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
542 CCAGTACTGT TACTGATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
543 CAGTACTGTT ACTGATTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
544 AGTACTGTTA CTGATTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
545 GTACTGTTAC TGATTTTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
546 TACTGTTACT GATTTTTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
547 ACTGTTACTG ATTTTTTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
548 CTGTTACTGA TTTTTTCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
549 TGTTACTGAT TTTTTCTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
550 GTTACTGATT TTTTCTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
551 TTACTGATTT TTTCTTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
552 TACTGATTTT TTCTTTTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
553 ACTGATTTTT TCTTTTTTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
554 CTGATTTTTT CTTTTTTAAC 20 20 base pairs nucleic acid single linear cDNA NO NO
555 TGATTTTTTC TTTTTTAACC 20 20 base pairs nucleic acid single linear cDNA NO NO
556 GATTTTTTCT TTTTTAACCC 20 20 base pairs nucleic acid single linear cDNA NO NO
557 ATTTTTTCTT TTTTAACCCT 20 20 base pairs nucleic acid single linear cDNA NO NO
558 TTTTTTCTTT TTTAACCCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
559 TTTTTCTTTT TTAACCCTGC 20 20 base pairs nucleic acid single linear cDNA NO NO
560 TTTTCTTTTT TAACCCTGCG 20 20 base pairs nucleic acid single linear cDNA NO NO
561 TTTCTTTTTT AACCCTGCGG 20 20 base pairs nucleic acid single linear cDNA NO NO
562 TTCTTTTTTA ACCCTGCGGG 20 20 base pairs nucleic acid single linear cDNA NO NO
563 TCTTTTTTAA CCCTGCGGGA 20 20 base pairs nucleic acid single linear cDNA NO NO
564 CTTTTTTAAC CCTGCGGGAT 20 20 base pairs nucleic acid single linear cDNA NO NO
565 TTTTTTAACC CTGCGGGATG 20 20 base pairs nucleic acid single linear cDNA NO NO
566 TTTTTAACCC TGCGGGATGT 20 20 base pairs nucleic acid single linear cDNA NO NO
567 TTTTAACCCT GCGGGATGTG 20 20 base pairs nucleic acid single linear cDNA NO NO
568 TTTAACCCTG CGGGATGTGG 20 20 base pairs nucleic acid single linear cDNA NO NO
569 TTAACCCTGC GGGATGTGGT 20 20 base pairs nucleic acid single linear cDNA NO NO
570 TAACCCTGCG GGATGTGGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
571 AACCCTGCGG GATGTGGTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
572 ACCCTGCGGG ATGTGGTATT 20 20 base pairs nucleic acid single linear cDNA NO NO
573 CCCTGCGGGA TGTGGTATTC 20 20 base pairs nucleic acid single linear cDNA NO NO
574 CCTGCGGGAT GTGGTATTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
575 CTGCGGGATG TGGTATTCCT 20 20 base pairs nucleic acid single linear cDNA NO NO
576 TGCGGGATGT GGTATTCCTA 20 20 base pairs nucleic acid single linear cDNA NO NO
577 GCGGGATGTG GTATTCCTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
578 CGGGATGTGG TATTCCTAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
579 GGGATGTGGT ATTCCTAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
580 GGATGTGGTA TTCCTAATTG 20 20 base pairs nucleic acid single linear cDNA NO NO
581 GATGTGGTAT TCCTAATTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
582 ATGTGGTATT CCTAATTGAA 20 20 base pairs nucleic acid single linear cDNA NO NO
583 TGTGGTATTC CTAATTGAAC 20 20 base pairs nucleic acid single linear cDNA NO NO
584 GTGGTATTCC TAATTGAACT 20 20 base pairs nucleic acid single linear cDNA NO NO
585 TGGTATTCCT AATTGAACTT 20 20 base pairs nucleic acid single linear cDNA NO NO
586 GGTATTCCTA ATTGAACTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
587 GTATTCCTAA TTGAACTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
588 TATTCCTAAT TGAACTTCCC 20 20 base pairs nucleic acid single linear cDNA NO NO
589 ATTCCTAATT GAACTTCCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
590 TTCCTAATTG AACTTCCCAG 20 20 base pairs nucleic acid single linear cDNA NO NO
591 TCCTAATTGA ACTTCCCAGA 20 20 base pairs nucleic acid single linear cDNA NO NO
592 CCTAATTGAA CTTCCCAGAA 20 20 base pairs nucleic acid single linear cDNA NO NO
593 CTAATTGAAC TTCCCAGAAG 20 20 base pairs nucleic acid single linear cDNA NO NO
594 TAATTGAACT TCCCAGAAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
595 AATTGAACTT CCCAGAAGTC 20 20 base pairs nucleic acid single linear cDNA NO NO
596 ATTGAACTTC CCAGAAGTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
597 TTGAACTTCC CAGAAGTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
598 TGAACTTCCC AGAAGTCTTG 20 20 base pairs nucleic acid single linear cDNA NO NO
599 GAACTTCCCA GAAGTCTTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
600 AACTTCCCAG AAGTCTTGAG 20 20 base pairs nucleic acid single linear cDNA NO NO
601 ACTTCCCAGA AGTCTTGAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
602 CTTCCCAGAA GTCTTGAGTT 20 20 base pairs nucleic acid single linear cDNA NO NO
603 TTCCCAGAAG TCTTGAGTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
604 TCCCAGAAGT CTTGAGTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
605 CCCAGAAGTC TTGAGTTCTC 20 20 base pairs nucleic acid single linear cDNA NO NO
606 CCAGAAGTCT TGAGTTCTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
607 CAGAAGTCTT GAGTTCTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
608 AGAAGTCTTG AGTTCTCTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
609 GAAGTCTTGA GTTCTCTTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
610 AAGTCTTGAG TTCTCTTATT 20 20 base pairs nucleic acid single linear cDNA NO NO
611 AGTCTTGAGT TCTCTTATTA 20 20 base pairs nucleic acid single linear cDNA NO NO
612 GTCTTGAGTT CTCTTATTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
613 TCTTGAGTTC TCTTATTAAG 20 20 base pairs nucleic acid single linear cDNA NO NO
614 CTTGAGTTCT CTTATTAAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
615 TTGAGTTCTC TTATTAAGTT 20 20 base pairs nucleic acid single linear cDNA NO NO
616 TGAGTTCTCT TATTAAGTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
617 GAGTTCTCTT ATTAAGTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
618 AGTTCTCTTA TTAAGTTCTC 20 20 base pairs nucleic acid single linear cDNA NO NO
619 GTTCTCTTAT TAAGTTCTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
620 TTCTCTTATT AAGTTCTCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
621 TCTCTTATTA AGTTCTCTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
622 CTCTTATTAA GTTCTCTGAA 20 20 base pairs nucleic acid single linear cDNA NO NO
623 TCTTATTAAG TTCTCTGAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
624 CTTATTAAGT TCTCTGAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
625 TTATTAAGTT CTCTGAAATC 20 20 base pairs nucleic acid single linear cDNA NO NO
626 TATTAAGTTC TCTGAAATCT 20 20 base pairs nucleic acid single linear cDNA NO NO
627 ATTAAGTTCT CTGAAATCTA 20 20 base pairs nucleic acid single linear cDNA NO NO
628 TTAAGTTCTC TGAAATCTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
629 TAAGTTCTCT GAAATCTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
630 AAGTTCTCTG AAATCTACTA 20 20 base pairs nucleic acid single linear cDNA NO NO
631 AGTTCTCTGA AATCTACTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
632 GTTCTCTGAA ATCTACTAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
633 TTCTCTGAAA TCTACTAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
634 TCTCTGAAAT CTACTAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
635 CTCTGAAATC TACTAATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
636 TCTGAAATCT ACTAATTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
637 CTGAAATCTA CTAATTTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
638 TGAAATCTAC TAATTTTCTC 20 20 base pairs nucleic acid single linear cDNA NO NO
639 GAAATCTACT AATTTTCTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
640 AAATCTACTA ATTTTCTCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
641 AATCTACTAA TTTTCTCCAT 20 20 base pairs nucleic acid single linear cDNA NO NO
642 ATCTACTAAT TTTCTCCATT 20 20 base pairs nucleic acid single linear cDNA NO NO
643 TCTACTAATT TTCTCCATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
644 CTACTAATTT TCTCCATTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
645 TACTAATTTT CTCCATTTAG 20 20 base pairs nucleic acid single linear cDNA NO NO
646 ACTAATTTTC TCCATTTAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
647 CTAATTTTCT CCATTTAGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
648 TAATTTTCTC CATTTAGTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
649 AATTTTCTCC ATTTAGTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
650 ATTTTCTCCA TTTAGTACTG 20 20 base pairs nucleic acid single linear cDNA NO NO
651 TTTTCTCCAT TTAGTACTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
652 TTTCTCCATT TAGTACTGTC 20 20 base pairs nucleic acid single linear cDNA NO NO
653 TTCTCCATTT AGTACTGTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
654 TCTCCATTTA GTACTGTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
655 CTCCATTTAG TACTGTCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
656 TCCATTTAGT ACTGTCTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
657 CCATTTAGTA CTGTCTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
658 CATTTAGTAC TGTCTTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
659 ATTTAGTACT GTCTTTTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
660 TTTAGTACTG TCTTTTTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
661 TTAGTACTGT CTTTTTTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
662 TAGTACTGTC TTTTTTCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
663 AGTACTGTCT TTTTTCTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
664 GTACTGTCTT TTTTCTTTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
665 TACTGTCTTT TTTCTTTATG 20 20 base pairs nucleic acid single linear cDNA NO NO
666 ACTGTCTTTT TTCTTTATGG 20 20 base pairs nucleic acid single linear cDNA NO NO
667 CTGTCTTTTT TCTTTATGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
668 TGTCTTTTTT CTTTATGGCA 20 20 base pairs nucleic acid single linear cDNA NO NO
669 GTCTTTTTTC TTTATGGCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
670 TCTTTTTTCT TTATGGCAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
671 CTTTTTTCTT TATGGCAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
672 TTTTTTCTTT ATGGCAAATA 20 20 base pairs nucleic acid single linear cDNA NO NO
673 TTTTTCTTTA TGGCAAATAC 20 20 base pairs nucleic acid single linear cDNA NO NO
674 TTTTCTTTAT GGCAAATACT 20 20 base pairs nucleic acid single linear cDNA NO NO
675 TTTCTTTATG GCAAATACTG 20 20 base pairs nucleic acid single linear cDNA NO NO
676 TTCTTTATGG CAAATACTGG 20 20 base pairs nucleic acid single linear cDNA NO NO
677 TCTTTATGGC AAATACTGGA 20 20 base pairs nucleic acid single linear cDNA NO NO
678 CTTTATGGCA AATACTGGAG 20 20 base pairs nucleic acid single linear cDNA NO NO
679 TTTATGGCAA ATACTGGAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
680 TTATGGCAAA TACTGGAGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
681 TATGGCAAAT ACTGGAGTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
682 ATGGCAAATA CTGGAGTATT 20 20 base pairs nucleic acid single linear cDNA NO NO
683 TGGCAAATAC TGGAGTATTG 20 20 base pairs nucleic acid single linear cDNA NO NO
684 GGCAAATACT GGAGTATTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
685 GCAAATACTG GAGTATTGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
686 CAAATACTGG AGTATTGTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
687 AAATACTGGA GTATTGTATG 20 20 base pairs nucleic acid single linear cDNA NO NO
688 AATACTGGAG TATTGTATGG 20 20 base pairs nucleic acid single linear cDNA NO NO
689 ATACTGGAGT ATTGTATGGA 20 20 base pairs nucleic acid single linear cDNA NO NO
690 TACTGGAGTA TTGTATGGAT 20 20 base pairs nucleic acid single linear cDNA NO NO
691 ACTGGAGTAT TGTATGGATT 20 20 base pairs nucleic acid single linear cDNA NO NO
692 CTGGAGTATT GTATGGATTC 20 20 base pairs nucleic acid single linear cDNA NO NO
693 TGGAGTATTG TATGGATTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
694 GGAGTATTGT ATGGATTCTC 20 20 base pairs nucleic acid single linear cDNA NO NO
695 GAGTATTGTA TGGATTCTCA 20 20 base pairs nucleic acid single linear cDNA NO NO
696 AGTATTGTAT GGATTCTCAG 20 20 base pairs nucleic acid single linear cDNA NO NO
697 GTATTGTATG GATTCTCAGG 20 20 base pairs nucleic acid single linear cDNA NO NO
698 TATTGTATGG ATTCTCAGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
699 ATTGTATGGA TTCTCAGGCC 20 20 base pairs nucleic acid single linear cDNA NO NO
700 TTGTATGGAT TCTCAGGCCC 20 20 base pairs nucleic acid single linear cDNA NO NO
701 TGTATGGATT CTCAGGCCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
702 GTATGGATTC TCAGGCCCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
703 TATGGATTCT CAGGCCCAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
704 ATGGATTCTC AGGCCCAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
705 TGGATTCTCA GGCCCAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
706 GGATTCTCAG GCCCAATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
707 GATTCTCAGG CCCAATTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
708 ATTCTCAGGC CCAATTTTTG 20 20 base pairs nucleic acid single linear cDNA NO NO
709 TTCTCAGGCC CAATTTTTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
710 TCTCAGGCCC AATTTTTGAA 20 20 base pairs nucleic acid single linear cDNA NO NO
711 CTCAGGCCCA ATTTTTGAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
712 TCAGGCCCAA TTTTTGAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
713 CAGGCCCAAT TTTTGAAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
714 AGGCCCAATT TTTGAAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
715 GGCCCAATTT TTGAAATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
716 GCCCAATTTT TGAAATTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
717 CCCAATTTTT GAAATTTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
718 CCAATTTTTG AAATTTTCCC 20 20 base pairs nucleic acid single linear cDNA NO NO
719 CAATTTTTGA AATTTTCCCT 20 20 base pairs nucleic acid single linear cDNA NO NO
720 AATTTTTGAA ATTTTCCCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
721 ATTTTTGAAA TTTTCCCTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
722 TTTTTGAAAT TTTCCCTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
723 TTTTGAAATT TTCCCTTCCT 20 20 base pairs nucleic acid single linear cDNA NO NO
724 TTTGAAATTT TCCCTTCCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
725 TTGAAATTTT CCCTTCCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
726 TGAAATTTTC CCTTCCTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
727 GAAATTTTCC CTTCCTTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
728 AAATTTTCCC TTCCTTTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
729 AATTTTCCCT TCCTTTTCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
730 ATTTTCCCTT CCTTTTCCAT 20 20 base pairs nucleic acid single linear cDNA NO NO
731 TTTTCCCTTC CTTTTCCATT 20 20 base pairs nucleic acid single linear cDNA NO NO
732 TTTCCCTTCC TTTTCCATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
733 TTCCCTTCCT TTTCCATTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
734 TCCCTTCCTT TTCCATTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
735 CCCTTCCTTT TCCATTTCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
736 CCTTCCTTTT CCATTTCTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
737 CTTCCTTTTC CATTTCTGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
738 TTCCTTTTCC ATTTCTGTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
739 TCCTTTTCCA TTTCTGTACA 20 20 base pairs nucleic acid single linear cDNA NO NO
740 CCTTTTCCAT TTCTGTACAA 20 20 base pairs nucleic acid single linear cDNA NO NO
741 CTTTTCCATT TCTGTACAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
742 TTTTCCATTT CTGTACAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
743 TTTCCATTTC TGTACAAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
744 TTCCATTTCT GTACAAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
745 TCCATTTCTG TACAAATTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
746 CCATTTCTGT ACAAATTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
747 CATTTCTGTA CAAATTTCTA 20 20 base pairs nucleic acid single linear cDNA NO NO
748 ATTTCTGTAC AAATTTCTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
749 TTTCTGTACA AATTTCTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
750 TTCTGTACAA ATTTCTACTA 20 20 base pairs nucleic acid single linear cDNA NO NO
751 TCTGTACAAA TTTCTACTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
752 CTGTACAAAT TTCTACTAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
753 TGTACAAATT TCTACTAATG 20 20 base pairs nucleic acid single linear cDNA NO NO
754 GTACAAATTT CTACTAATGC 20 20 base pairs nucleic acid single linear cDNA NO NO
755 TACAAATTTC TACTAATGCT 20 20 base pairs nucleic acid single linear cDNA NO NO
756 ACAAATTTCT ACTAATGCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
757 CAAATTTCTA CTAATGCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
758 AAATTTCTAC TAATGCTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
759 AATTTCTACT AATGCTTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
760 ATTTCTACTA ATGCTTTTAT 20 20 base pairs nucleic acid single linear cDNA NO NO
761 TTTCTACTAA TGCTTTTATT 20 20 base pairs nucleic acid single linear cDNA NO NO
762 TTCTACTAAT GCTTTTATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
763 TCTACTAATG CTTTTATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
764 CTACTAATGC TTTTATTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
765 TACTAATGCT TTTATTTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
766 ACTAATGCTT TTATTTTTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
767 CTAATGCTTT TATTTTTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
768 TAATGCTTTT ATTTTTTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
769 AATGCTTTTA TTTTTTCTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
770 ATGCTTTTAT TTTTTCTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
771 TGCTTTTATT TTTTCTTCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
772 GCTTTTATTT TTTCTTCTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
773 CTTTTATTTT TTCTTCTGTC 20 20 base pairs nucleic acid single linear cDNA NO NO
774 TTTTATTTTT TCTTCTGTCA 20 20 base pairs nucleic acid single linear cDNA NO NO
775 TTTATTTTTT CTTCTGTCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
776 TTATTTTTTC TTCTGTCAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
777 TATTTTTTCT TCTGTCAATG 20 20 base pairs nucleic acid single linear cDNA NO NO
778 ATTTTTTCTT CTGTCAATGG 20 20 base pairs nucleic acid single linear cDNA NO NO
779 TTTTTTCTTC TGTCAATGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
780 TTTTTCTTCT GTCAATGGCC 20 20 base pairs nucleic acid single linear cDNA NO NO
781 TTTTCTTCTG TCAATGGCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
782 TTTCTTCTGT CAATGGCCAT 20 20 base pairs nucleic acid single linear cDNA NO NO
783 TTCTTCTGTC AATGGCCATT 20 20 base pairs nucleic acid single linear cDNA NO NO
784 TCTTCTGTCA ATGGCCATTG 20 20 base pairs nucleic acid single linear cDNA NO NO
785 CTTCTGTCAA TGGCCATTGT 20 20 base pairs nucleic acid single linear cDNA NO NO
786 TTCTGTCAAT GGCCATTGTT 20 20 base pairs nucleic acid single linear cDNA NO NO
787 TCTGTCAATG GCCATTGTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
788 CTGTCAATGG CCATTGTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
789 TGTCAATGGC CATTGTTTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
790 GTCAATGGCC ATTGTTTAAC 20 20 base pairs nucleic acid single linear cDNA NO NO
791 TCAATGGCCA TTGTTTAACT 20 20 base pairs nucleic acid single linear cDNA NO NO
792 CAATGGCCAT TGTTTAACTT 20 20 base pairs nucleic acid single linear cDNA NO NO
793 AATGGCCATT GTTTAACTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
794 ATGGCCATTG TTTAACTTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
795 TGGCCATTGT TTAACTTTTG 20 20 base pairs nucleic acid single linear cDNA NO NO
796 GGCCATTGTT TAACTTTTGG 20 20 base pairs nucleic acid single linear cDNA NO NO
797 GCCATTGTTT AACTTTTGGG 20 20 base pairs nucleic acid single linear cDNA NO NO
798 CCATTGTTTA ACTTTTGGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
799 CATTGTTTAA CTTTTGGGCC 20 20 base pairs nucleic acid single linear cDNA NO NO
800 ATTGTTTAAC TTTTGGGCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
801 TTGTTTAACT TTTGGGCCAT 20 20 base pairs nucleic acid single linear cDNA NO NO
802 TGTTTAACTT TTGGGCCATC 20 20 base pairs nucleic acid single linear cDNA NO NO
803 GTTTAACTTT TGGGCCATCC 20 20 base pairs nucleic acid single linear cDNA NO NO
804 TTTAACTTTT GGGCCATCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
805 TTAACTTTTG GGCCATCCAT 20 20 base pairs nucleic acid single linear cDNA NO NO
806 TAACTTTTGG GCCATCCATT 20 20 base pairs nucleic acid single linear cDNA NO NO
807 AACTTTTGGG CCATCCATTC 20 20 base pairs nucleic acid single linear cDNA NO NO
808 ACTTTTGGGC CATCCATTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
809 CTTTTGGGCC ATCCATTCCT 20 20 base pairs nucleic acid single linear cDNA NO NO
810 TTTTGGGCCA TCCATTCCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
811 TTTGGGCCAT CCATTCCTGG 20 20 base pairs nucleic acid single linear cDNA NO NO
812 TTGGGCCATC CATTCCTGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
813 TGGGCCATCC ATTCCTGGCT 20 20 base pairs nucleic acid single linear cDNA NO NO
814 GGGCCATCCA TTCCTGGCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
815 GGCCATCCAT TCCTGGCTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
816 GCCATCCATT CCTGGCTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
817 CCATCCATTC CTGGCTTTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
818 CATCCATTCC TGGCTTTAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
819 ATCCATTCCT GGCTTTAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
820 TCCATTCCTG GCTTTAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
821 CCATTCCTGG CTTTAATTTT 20 20 base pairs nucleic acid single linear cDNA NO NO
822 CATTCCTGGC TTTAATTTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
823 ATTCCTGGCT TTAATTTTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
824 TTCCTGGCTT TAATTTTACT 20 20 base pairs nucleic acid single linear cDNA NO NO
825 TCCTGGCTTT AATTTTACTG 20 20 base pairs nucleic acid single linear cDNA NO NO
826 CCTGGCTTTA ATTTTACTGG 20 20 base pairs nucleic acid single linear cDNA NO NO
827 CTGGCTTTAA TTTTACTGGT 20 20 base pairs nucleic acid single linear cDNA NO NO
828 TGGCTTTAAT TTTACTGGTA 20 20 base pairs nucleic acid single linear cDNA NO NO
829 GGCTTTAATT TTACTGGTAC 20 20 base pairs nucleic acid single linear cDNA NO NO
830 GCTTTAATTT TACTGGTACA 20 20 base pairs nucleic acid single linear cDNA NO NO
831 CTTTAATTTT ACTGGTACAG 20 20 base pairs nucleic acid single linear cDNA NO NO
832 TTTAATTTTA CTGGTACAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
833 TTAATTTTAC TGGTACAGTC 20 20 base pairs nucleic acid single linear cDNA NO NO
834 TAATTTTACT GGTACAGTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
835 AATTTTACTG GTACAGTCTC 20 20 base pairs nucleic acid single linear cDNA NO NO
836 ATTTTACTGG TACAGTCTCA 20 20 base pairs nucleic acid single linear cDNA NO NO
837 TTTTACTGGT ACAGTCTCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
838 TTTACTGGTA CAGTCTCAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
839 TTACTGGTAC AGTCTCAATA 20 20 base pairs nucleic acid single linear cDNA NO NO
840 TACTGGTACA GTCTCAATAG 20 20 base pairs nucleic acid single linear cDNA NO NO
841 ACTGGTACAG TCTCAATAGG 20 20 base pairs nucleic acid single linear cDNA NO NO
842 CTGGTACAGT CTCAATAGGG 20 20 base pairs nucleic acid single linear cDNA NO NO
843 TGGTACAGTC TCAATAGGGC 20 20 base pairs nucleic acid single linear cDNA NO NO
844 GGTACAGTCT CAATAGGGCT 20 20 base pairs nucleic acid single linear cDNA NO NO
845 GTACAGTCTC AATAGGGCTA 20 20 base pairs nucleic acid single linear cDNA NO NO
846 TACAGTCTCA ATAGGGCTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
847 ACAGTCTCAA TAGGGCTAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
848 CAGTCTCAAT AGGGCTAATG 20 20 base pairs nucleic acid single linear cDNA NO NO
849 AGTCTCAATA GGGCTAATGG 20 20 base pairs nucleic acid single linear cDNA NO NO
850 GTCTCAATAG GGCTAATGGG 20 20 base pairs nucleic acid single linear cDNA NO NO
851 TCTCAATAGG GCTAATGGGA 20 20 base pairs nucleic acid single linear cDNA NO NO
852 CTCAATAGGG CTAATGGGAA 20 20 base pairs nucleic acid single linear cDNA NO NO
853 TCAATAGGGC TAATGGGAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
854 CAATAGGGCT AATGGGAAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
855 AATAGGGCTA ATGGGAAAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
856 ATAGGGCTAA TGGGAAAATT 20 20 base pairs nucleic acid single linear cDNA NO NO
857 TAGGGCTAAT GGGAAAATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
858 AGGGCTAATG GGAAAATTTA 20 20 base pairs nucleic acid single linear cDNA NO NO
859 GGGCTAATGG GAAAATTTAA 20 20 base pairs nucleic acid single linear cDNA NO NO
860 GGCTAATGGG AAAATTTAAA 20 20 base pairs nucleic acid single linear cDNA NO NO
861 GCTAATGGGA AAATTTAAAG 20 20 base pairs nucleic acid single linear cDNA NO NO
862 CTAATGGGAA AATTTAAAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
863 TAATGGGAAA ATTTAAAGTG 20 20 base pairs nucleic acid single linear cDNA NO NO
864 AATGGGAAAA TTTAAAGTGC 20 20 base pairs nucleic acid single linear cDNA NO NO
865 ATGGGAAAAT TTAAAGTGCA 20 20 base pairs nucleic acid single linear cDNA NO NO
866 TGGGAAAATT TAAAGTGCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
867 GGGAAAATTT AAAGTGCAAC 20 20 base pairs nucleic acid single linear cDNA NO NO
868 GGAAAATTTA AAGTGCAACC 20 20 base pairs nucleic acid single linear cDNA NO NO
869 GAAAATTTAA AGTGCAACCA 20 20 base pairs nucleic acid single linear cDNA NO NO
870 AAAATTTAAA GTGCAACCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
871 AAATTTAAAG TGCAACCAAT 20 20 base pairs nucleic acid single linear cDNA NO NO
872 AATTTAAAGT GCAACCAATC 20 20 base pairs nucleic acid single linear cDNA NO NO
873 ATTTAAAGTG CAACCAATCT 20 20 base pairs nucleic acid single linear cDNA NO NO
874 TTTAAAGTGC AACCAATCTG 20 20 base pairs nucleic acid single linear cDNA NO NO
875 TTAAAGTGCA ACCAATCTGA 20 20 base pairs nucleic acid single linear cDNA NO NO
876 TAAAGTGCAA CCAATCTGAG 20 20 base pairs nucleic acid single linear cDNA NO NO
877 AAAGTGCAAC CAATCTGAGT 20 20 base pairs nucleic acid single linear cDNA NO NO
878 AAGTGCAACC AATCTGAGTC 20 20 base pairs nucleic acid single linear cDNA NO NO
879 AGTGCAACCA ATCTGAGTCA 20 20 base pairs nucleic acid single linear cDNA NO NO
880 GTGCAACCAA TCTGAGTCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
881 TGCAACCAAT CTGAGTCAAC 20 20 base pairs nucleic acid single linear cDNA NO NO
882 GCAACCAATC TGAGTCAACA 20 20 base pairs nucleic acid single linear cDNA NO NO
883 CAACCAATCT GAGTCAACAG 20 20 base pairs nucleic acid single linear cDNA NO NO
884 AACCAATCTG AGTCAACAGA 20 20 base pairs nucleic acid single linear cDNA NO NO
885 ACCAATCTGA GTCAACAGAT 20 20 base pairs nucleic acid single linear cDNA NO NO
886 CCAATCTGAG TCAACAGATT 20 20 base pairs nucleic acid single linear cDNA NO NO
887 CAATCTGAGT CAACAGATTT 20 20 base pairs nucleic acid single linear cDNA NO NO
888 AATCTGAGTC AACAGATTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
889 ATCTGAGTCA ACAGATTTCT 20 20 base pairs nucleic acid single linear cDNA NO NO
890 TCTGAGTCAA CAGATTTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO
891 CTGAGTCAAC AGATTTCTTC 20 20 base pairs nucleic acid single linear cDNA NO NO
892 TGAGTCAACA GATTTCTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO
893 GAGTCAACAG ATTTCTTCCA 20 20 base pairs nucleic acid single linear cDNA NO NO
894 AGTCAACAGA TTTCTTCCAA 20 20 base pairs nucleic acid single linear cDNA NO NO
895
GTCAACAGAT TTCTTCCAAT 20 20 base pairsnucleic acid single linear cDNA NO NO 896 TCAACAGATT TCTTCCAATT 20 20base pairs nucleic acid single linear cDNA NO NO 897 CAACAGATTTCTTCCAATTA 20 20 base pairs nucleic acid single linear cDNA NO NO 898AACAGATTTC TTCCAATTAT 20 20 base pairs nucleic acid single linear cDNANO NO 899 ACAGATTTCT TCCAATTATG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 900 CAGATTTCTT CCAATTATGT 20 20 base pairs nucleicacid single linear cDNA NO NO 901 AGATTTCTTC CAATTATGTT 20 20 base pairsnucleic acid single linear cDNA NO NO 902 GATTTCTTCC AATTATGTTG 20 20base pairs nucleic acid single linear cDNA NO NO 903 ATTTCTTCCAATTATGTTGA 20 20 base pairs nucleic acid single linear cDNA NO NO 904TTTCTTCCAA TTATGTTGAC 20 20 base pairs nucleic acid single linear cDNANO NO 905 TTCTTCCAAT TATGTTGACA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 906 TCTTCCAATT ATGTTGACAG 20 20 base pairs nucleicacid single linear cDNA NO NO 907 CTTCCAATTA TGTTGACAGG 20 20 base pairsnucleic acid single linear cDNA NO NO 908 TTCCAATTAT GTTGACAGGT 20 20base pairs nucleic acid single linear cDNA NO NO 909 TCCAATTATGTTGACAGGTG 20 20 base pairs nucleic acid single linear cDNA NO NO 910CCAATTATGT TGACAGGTGT 20 20 base pairs nucleic acid single linear cDNANO NO 911 CAATTATGTT GACAGGTGTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 912 AATTATGTTG ACAGGTGTAG 20 20 base pairs nucleicacid single linear cDNA NO NO 913 ATTATGTTGA CAGGTGTAGG 20 20 base pairsnucleic acid single linear cDNA NO NO 914 TTATGTTGAC AGGTGTAGGT 20 20base pairs nucleic acid single linear cDNA NO NO 915 TATGTTGACAGGTGTAGGTC 20 20 base pairs nucleic acid single linear cDNA NO NO 916ATGTTGACAG GTGTAGGTCC 20 20 base pairs nucleic acid single linear cDNANO NO 917 TGTTGACAGG TGTAGGTCCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 918 GTTGACAGGT GTAGGTCCTA 20 20 base pairs nucleicacid single linear cDNA NO NO 919 TTGACAGGTG TAGGTCCTAC 20 20 base pairsnucleic acid single linear cDNA NO NO 920 
TGACAGGTGT AGGTCCTACT 20 20base pairs nucleic acid single linear cDNA NO NO 921 GACAGGTGTAGGTCCTACTA 20 20 base pairs nucleic acid single linear cDNA NO NO 922ACAGGTGTAG GTCCTACTAA 20 20 base pairs nucleic acid single linear cDNANO NO 923 CAGGTGTAGG TCCTACTAAT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 924 AGGTGTAGGT CCTACTAATA 20 20 base pairs nucleicacid single linear cDNA NO NO 925 GGTGTAGGTC CTACTAATAC 20 20 base pairsnucleic acid single linear cDNA NO NO 926 GTGTAGGTCC TACTAATACT 20 20base pairs nucleic acid single linear cDNA NO NO 927 TGTAGGTCCTACTAATACTG 20 20 base pairs nucleic acid single linear cDNA NO NO 928GTAGGTCCTA CTAATACTGT 20 20 base pairs nucleic acid single linear cDNANO NO 929 TAGGTCCTAC TAATACTGTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 930 AGGTCCTACT AATACTGTAC 20 20 base pairs nucleicacid single linear cDNA NO NO 931 GGTCCTACTA ATACTGTACC 20 20 base pairsnucleic acid single linear cDNA NO NO 932 GTCCTACTAA TACTGTACCT 20 20base pairs nucleic acid single linear cDNA NO NO 933 TCCTACTAATACTGTACCTA 20 20 base pairs nucleic acid single linear cDNA NO NO 934CCTACTAATA CTGTACCTAT 20 20 base pairs nucleic acid single linear cDNANO NO 935 CTACTAATAC TGTACCTATA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 936 TACTAATACT GTACCTATAG 20 20 base pairs nucleicacid single linear cDNA NO NO 937 ACTAATACTG TACCTATAGC 20 20 base pairsnucleic acid single linear cDNA NO NO 938 CTAATACTGT ACCTATAGCT 20 20base pairs nucleic acid single linear cDNA NO NO 939 TAATACTGTACCTATAGCTT 20 20 base pairs nucleic acid single linear cDNA NO NO 940AATACTGTAC CTATAGCTTT 20 20 base pairs nucleic acid single linear cDNANO NO 941 ATACTGTACC TATAGCTTTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 942 TACTGTACCT ATAGCTTTAT 20 20 base pairs nucleicacid single linear cDNA NO NO 943 ACTGTACCTA TAGCTTTATG 20 20 base pairsnucleic acid single linear cDNA NO NO 944 CTGTACCTAT AGCTTTATGT 20 20base pairs nucleic acid single linear cDNA NO NO 945 
TGTACCTATAGCTTTATGTC 20 20 base pairs nucleic acid single linear cDNA NO NO 946GTACCTATAG CTTTATGTCC 20 20 base pairs nucleic acid single linear cDNANO NO 947 TACCTATAGC TTTATGTCCA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 948 ACCTATAGCT TTATGTCCAC 20 20 base pairs nucleicacid single linear cDNA NO NO 949 CCTATAGCTT TATGTCCACA 20 20 base pairsnucleic acid single linear cDNA NO NO 950 CTATAGCTTT ATGTCCACAG 20 20base pairs nucleic acid single linear cDNA NO NO 951 TATAGCTTTATGTCCACAGA 20 20 base pairs nucleic acid single linear cDNA NO NO 952ATAGCTTTAT GTCCACAGAT 20 20 base pairs nucleic acid single linear cDNANO NO 953 TAGCTTTATG TCCACAGATT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 954 AGCTTTATGT CCACAGATTT 20 20 base pairs nucleicacid single linear cDNA NO NO 955 GCTTTATGTC CACAGATTTC 20 20 base pairsnucleic acid single linear cDNA NO NO 956 CTTTATGTCC ACAGATTTCT 20 20base pairs nucleic acid single linear cDNA NO NO 957 TTTATGTCCACAGATTTCTA 20 20 base pairs nucleic acid single linear cDNA NO NO 958TTATGTCCAC AGATTTCTAT 20 20 base pairs nucleic acid single linear cDNANO NO 959 TATGTCCACA GATTTCTATG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 960 ATGTCCACAG ATTTCTATGA 20 20 base pairs nucleicacid single linear cDNA NO NO 961 TGTCCACAGA TTTCTATGAG 20 20 base pairsnucleic acid single linear cDNA NO NO 962 GTCCACAGAT TTCTATGAGT 20 20base pairs nucleic acid single linear cDNA NO NO 963 TCCACAGATTTCTATGAGTA 20 20 base pairs nucleic acid single linear cDNA NO NO 964CCACAGATTT CTATGAGTAT 20 20 base pairs nucleic acid single linear cDNANO NO 965 CACAGATTTC TATGAGTATC 20 20 base pairs nucleic acid singlelinear cDNA NO NO 966 ACAGATTTCT ATGAGTATCT 20 20 base pairs nucleicacid single linear cDNA NO NO 967 CAGATTTCTA TGAGTATCTG 20 20 base pairsnucleic acid single linear cDNA NO NO 968 AGATTTCTAT GAGTATCTGA 20 20base pairs nucleic acid single linear cDNA NO NO 969 GATTTCTATGAGTATCTGAT 20 20 base pairs nucleic acid single linear cDNA NO NO 
970ATTTCTATGA GTATCTGATC 20 20 base pairs nucleic acid single linear cDNANO NO 971 TTTCTATGAG TATCTGATCA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 972 TTCTATGAGT ATCTGATCAT 20 20 base pairs nucleicacid single linear cDNA NO NO 973 TCTATGAGTA TCTGATCATA 20 20 base pairsnucleic acid single linear cDNA NO NO 974 CTATGAGTAT CTGATCATAC 20 20base pairs nucleic acid single linear cDNA NO NO 975 TATGAGTATCTGATCATACT 20 20 base pairs nucleic acid single linear cDNA NO NO 976ATGAGTATCT GATCATACTG 20 20 base pairs nucleic acid single linear cDNANO NO 977 TGAGTATCTG ATCATACTGT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 978 GAGTATCTGA TCATACTGTC 20 20 base pairs nucleicacid single linear cDNA NO NO 979 AGTATCTGAT CATACTGTCT 20 20 base pairsnucleic acid single linear cDNA NO NO 980 GTATCTGATC ATACTGTCTT 20 20base pairs nucleic acid single linear cDNA NO NO 981 TATCTGATCATACTGTCTTA 20 20 base pairs nucleic acid single linear cDNA NO NO 982ATCTGATCAT ACTGTCTTAC 20 20 base pairs nucleic acid single linear cDNANO NO 983 TCTGATCATA CTGTCTTACT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 984 CTGATCATAC TGTCTTACTT 20 20 base pairs nucleicacid single linear cDNA NO NO 985 TGATCATACT GTCTTACTTT 20 20 base pairsnucleic acid single linear cDNA NO NO 986 GATCATACTG TCTTACTTTG 20 20base pairs nucleic acid single linear cDNA NO NO 987 ATCATACTGTCTTACTTTGA 20 20 base pairs nucleic acid single linear cDNA NO NO 988TCATACTGTC TTACTTTGAT 20 20 base pairs nucleic acid single linear cDNANO NO 989 CATACTGTCT TACTTTGATA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 990 ATACTGTCTT ACTTTGATAA 20 20 base pairs nucleicacid single linear cDNA NO NO 991 TACTGTCTTA CTTTGATAAA 20 20 base pairsnucleic acid single linear cDNA NO NO 992 ACTGTCTTAC TTTGATAAAA 20 20base pairs nucleic acid single linear cDNA NO NO 993 CTGTCTTACTTTGATAAAAC 20 20 base pairs nucleic acid single linear cDNA NO NO 994TGTCTTACTT TGATAAAACC 20 20 base pairs nucleic acid single linear cDNANO NO 995 
GTCTTACTTT GATAAAACCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 996 TCTTACTTTG ATAAAACCTC 20 20 base pairs nucleicacid single linear cDNA NO NO 997 CTTACTTTGA TAAAACCTCC 20 20 base pairsnucleic acid single linear cDNA NO NO 998 TTACTTTGAT AAAACCTCCA 20 20base pairs nucleic acid single linear cDNA NO NO 999 TACTTTGATAAAACCTCCAA 20 20 base pairs nucleic acid single linear cDNA NO NO 1000ACTTTGATAA AACCTCCAAT 20 20 base pairs nucleic acid single linear cDNANO NO 1001 CTTTGATAAA ACCTCCAATT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1002 TTTGATAAAA CCTCCAATTC 20 20 base pairs nucleicacid single linear cDNA NO NO 1003 TTGATAAAAC CTCCAATTCC 20 20 basepairs nucleic acid single linear cDNA NO NO 1004 TGATAAAACC TCCAATTCCC20 20 base pairs nucleic acid single linear cDNA NO NO 1005 GATAAAACCTCCAATTCCCC 20 20 base pairs nucleic acid single linear cDNA NO NO 1006ATAAAACCTC CAATTCCCCC 20 20 base pairs nucleic acid single linear cDNANO NO 1007 TAAAACCTCC AATTCCCCCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1008 AAAACCTCCA ATTCCCCCTA 20 20 base pairs nucleicacid single linear cDNA NO NO 1009 AAACCTCCAA TTCCCCCTAT 20 20 basepairs nucleic acid single linear cDNA NO NO 1010 AACCTCCAAT TCCCCCTATC20 20 base pairs nucleic acid single linear cDNA NO NO 1011 ACCTCCAATTCCCCCTATCA 20 20 base pairs nucleic acid single linear cDNA NO NO 1012CCTCCAATTC CCCCTATCAT 20 20 base pairs nucleic acid single linear cDNANO NO 1013 CTCCAATTCC CCCTATCATT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1014 TCCAATTCCC CCTATCATTT 20 20 base pairs nucleicacid single linear cDNA NO NO 1015 CCAATTCCCC CTATCATTTT 20 20 basepairs nucleic acid single linear cDNA NO NO 1016 CAATTCCCCC TATCATTTTT20 20 base pairs nucleic acid single linear cDNA NO NO 1017 AATTCCCCCTATCATTTTTG 20 20 base pairs nucleic acid single linear cDNA NO NO 1018ATTCCCCCTA TCATTTTTGG 20 20 base pairs nucleic acid single linear cDNANO NO 1019 TTCCCCCTAT CATTTTTGGT 20 20 base pairs nucleic acid 
singlelinear cDNA NO NO 1020 TCCCCCTATC ATTTTTGGTT 20 20 base pairs nucleicacid single linear cDNA NO NO 1021 CCCCCTATCA TTTTTGGTTT 20 20 basepairs nucleic acid single linear cDNA NO NO 1022 CCCCTATCAT TTTTGGTTTC20 20 base pairs nucleic acid single linear cDNA NO NO 1023 CCCTATCATTTTTGGTTTCC 20 20 base pairs nucleic acid single linear cDNA NO NO 1024CCTATCATTT TTGGTTTCCA 20 20 base pairs nucleic acid single linear cDNANO NO 1025 CTATCATTTT TGGTTTCCAT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1026 TATCATTTTT GGTTTCCATC 20 20 base pairs nucleicacid single linear cDNA NO NO 1027 ATCATTTTTG GTTTCCATCT 20 20 basepairs nucleic acid single linear cDNA NO NO 1028 TCATTTTTGG TTTCCATCTT20 20 base pairs nucleic acid single linear cDNA NO NO 1029 CATTTTTGGTTTCCATCTTC 20 20 base pairs nucleic acid single linear cDNA NO NO 1030ATTTTTGGTT TCCATCTTCC 20 20 base pairs nucleic acid single linear cDNANO NO 1031 TTTTTGGTTT CCATCTTCCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1032 TTTTGGTTTC CATCTTCCTG 20 20 base pairs nucleicacid single linear cDNA NO NO 1033 TTTGGTTTCC ATCTTCCTGG 20 20 basepairs nucleic acid single linear cDNA NO NO 1034 TTGGTTTCCA TCTTCCTGGC20 20 base pairs nucleic acid single linear cDNA NO NO 1035 TGGTTTCCATCTTCCTGGCA 20 20 base pairs nucleic acid single linear cDNA NO NO 1036GGTTTCCATC TTCCTGGCAA 20 20 base pairs nucleic acid single linear cDNANO NO 1037 GTTTCCATCT TCCTGGCAAA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1038 TTTCCATCTT CCTGGCAAAC 20 20 base pairs nucleicacid single linear cDNA NO NO 1039 TTCCATCTTC CTGGCAAACT 20 20 basepairs nucleic acid single linear cDNA NO NO 1040 TCCATCTTCC TGGCAAACTC20 20 base pairs nucleic acid single linear cDNA NO NO 1041 CCATCTTCCTGGCAAACTCA 20 20 base pairs nucleic acid single linear cDNA NO NO 1042CATCTTCCTG GCAAACTCAT 20 20 base pairs nucleic acid single linear cDNANO NO 1043 ATCTTCCTGG CAAACTCATT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1044 TCTTCCTGGC AAACTCATTT 20 20 
base pairs nucleicacid single linear cDNA NO NO 1045 CTTCCTGGCA AACTCATTTC 20 20 basepairs nucleic acid single linear cDNA NO NO 1046 TTCCTGGCAA ACTCATTTCT20 20 base pairs nucleic acid single linear cDNA NO NO 1047 TCCTGGCAAACTCATTTCTT 20 20 base pairs nucleic acid single linear cDNA NO NO 1048CCTGGCAAAC TCATTTCTTC 20 20 base pairs nucleic acid single linear cDNANO NO 1049 CTGGCAAACT CATTTCTTCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1050 TGGCAAACTC ATTTCTTCTA 20 20 base pairs nucleicacid single linear cDNA NO NO 1051 GGCAAACTCA TTTCTTCTAA 20 20 basepairs nucleic acid single linear cDNA NO NO 1052 GCAAACTCAT TTCTTCTAAT20 20 base pairs nucleic acid single linear cDNA NO NO 1053 CAAACTCATTTCTTCTAATA 20 20 base pairs nucleic acid single linear cDNA NO NO 1054AAACTCATTT CTTCTAATAC 20 20 base pairs nucleic acid single linear cDNANO NO 1055 AACTCATTTC TTCTAATACT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1056 ACTCATTTCT TCTAATACTG 20 20 base pairs nucleicacid single linear cDNA NO NO 1057 CTCATTTCTT CTAATACTGT 20 20 basepairs nucleic acid single linear cDNA NO NO 1058 TCATTTCTTC TAATACTGTA20 20 base pairs nucleic acid single linear cDNA NO NO 1059 CATTTCTTCTAATACTGTAT 20 20 base pairs nucleic acid single linear cDNA NO NO 1060ATTTCTTCTA ATACTGTATC 20 20 base pairs nucleic acid single linear cDNANO NO 1061 TTTCTTCTAA TACTGTATCA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1062 TTCTTCTAAT ACTGTATCAT 20 20 base pairs nucleicacid single linear cDNA NO NO 1063 TCTTCTAATA CTGTATCATC 20 20 basepairs nucleic acid single linear cDNA NO NO 1064 CTTCTAATAC TGTATCATCT20 20 base pairs nucleic acid single linear cDNA NO NO 1065 TTCTAATACTGTATCATCTG 20 20 base pairs nucleic acid single linear cDNA NO NO 1066TCTAATACTG TATCATCTGC 20 20 base pairs nucleic acid single linear cDNANO NO 1067 CTAATACTGT ATCATCTGCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1068 TAATACTGTA TCATCTGCTC 20 20 base pairs nucleicacid single linear cDNA NO NO 1069 
AATACTGTAT CATCTGCTCC 20 20 basepairs nucleic acid single linear cDNA NO NO 1070 ATACTGTATC ATCTGCTCCT20 20 base pairs nucleic acid single linear cDNA NO NO 1071 TACTGTATCATCTGCTCCTG 20 20 base pairs nucleic acid single linear cDNA NO NO 1072ACTGTATCAT CTGCTCCTGT 20 20 base pairs nucleic acid single linear cDNANO NO 1073 CTGTATCATC TGCTCCTGTA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1074 TGTATCATCT GCTCCTGTAT 20 20 base pairs nucleicacid single linear cDNA NO NO 1075 GTATCATCTG CTCCTGTATC 20 20 basepairs nucleic acid single linear cDNA NO NO 1076 TATCATCTGC TCCTGTATCT20 20 base pairs nucleic acid single linear cDNA NO NO 1077 ATCATCTGCTCCTGTATCTA 20 20 base pairs nucleic acid single linear cDNA NO NO 1078TCATCTGCTC CTGTATCTAA 20 20 base pairs nucleic acid single linear cDNANO NO 1079 CATCTGCTCC TGTATCTAAT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1080 ATCTGCTCCT GTATCTAATA 20 20 base pairs nucleicacid single linear cDNA NO NO 1081 TCTGCTCCTG TATCTAATAG 20 20 basepairs nucleic acid single linear cDNA NO NO 1082 CTGCTCCTGT ATCTAATAGA20 20 base pairs nucleic acid single linear cDNA NO NO 1083 TGCTCCTGTATCTAATAGAG 20 20 base pairs nucleic acid single linear cDNA NO NO 1084GCTCCTGTAT CTAATAGAGC 20 20 base pairs nucleic acid single linear cDNANO NO 1085 CTCCTGTATC TAATAGAGCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1086 TCCTGTATCT AATAGAGCTT 20 20 base pairs nucleicacid single linear cDNA NO NO 1087 CCTGTATCTA ATAGAGCTTC 20 20 basepairs nucleic acid single linear cDNA NO NO 1088 CTGTATCTAA TAGAGCTTCC20 20 base pairs nucleic acid single linear cDNA NO NO 1089 TGTATCTAATAGAGCTTCCT 20 20 base pairs nucleic acid single linear cDNA NO NO 1090GTATCTAATA GAGCTTCCTT 20 20 base pairs nucleic acid single linear cDNANO NO 1091 TATCTAATAG AGCTTCCTTT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1092 ATCTAATAGA GCTTCCTTTA 20 20 base pairs nucleicacid single linear cDNA NO NO 1093 TCTAATAGAG CTTCCTTTAG 20 20 basepairs nucleic acid single 
linear cDNA NO NO 1094 CTAATAGAGC TTCCTTTAGT20 20 base pairs nucleic acid single linear cDNA NO NO 1095 TAATAGAGCTTCCTTTAGTT 20 20 base pairs nucleic acid single linear cDNA NO NO 1096AATAGAGCTT CCTTTAGTTG 20 20 base pairs nucleic acid single linear cDNANO NO 1097 ATAGAGCTTC CTTTAGTTGC 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1098 TAGAGCTTCC TTTAGTTGCC 20 20 base pairs nucleicacid single linear cDNA NO NO 1099 AGAGCTTCCT TTAGTTGCCC 20 20 basepairs nucleic acid single linear cDNA NO NO 1100 GAGCTTCCTT TAGTTGCCCC20 20 base pairs nucleic acid single linear cDNA NO NO 1101 AGCTTCCTTTAGTTGCCCCC 20 20 base pairs nucleic acid single linear cDNA NO NO 1102GCTTCCTTTA GTTGCCCCCC 20 20 base pairs nucleic acid single linear cDNANO NO 1103 CTTCCTTTAG TTGCCCCCCT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1104 TTCCTTTAGT TGCCCCCCTA 20 20 base pairs nucleicacid single linear cDNA NO NO 1105 TCCTTTAGTT GCCCCCCTAT 20 20 basepairs nucleic acid single linear cDNA NO NO 1106 CCTTTAGTTG CCCCCCTATC20 20 base pairs nucleic acid single linear cDNA NO NO 1107 CTTTAGTTGCCCCCCTATCT 20 20 base pairs nucleic acid single linear cDNA NO NO 1108TTTAGTTGCC CCCCTATCTT 20 20 base pairs nucleic acid single linear cDNANO NO 1109 TTAGTTGCCC CCCTATCTTT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1110 TAGTTGCCCC CCTATCTTTA 20 20 base pairs nucleicacid single linear cDNA NO NO 1111 AGTTGCCCCC CTATCTTTAT 20 20 basepairs nucleic acid single linear cDNA NO NO 1112 GTTGCCCCCC TATCTTTATT20 20 base pairs nucleic acid single linear cDNA NO NO 1113 TTGCCCCCCTATCTTTATTG 20 20 base pairs nucleic acid single linear cDNA NO NO 1114TGCCCCCCTA TCTTTATTGT 20 20 base pairs nucleic acid single linear cDNANO NO 1115 GCCCCCCTAT CTTTATTGTG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1116 CCCCCCTATC TTTATTGTGA 20 20 base pairs nucleicacid single linear cDNA NO NO 1117 CCCCCTATCT TTATTGTGAC 20 20 basepairs nucleic acid single linear cDNA NO NO 1118 CCCCTATCTT TATTGTGACG20 20 base 
pairs nucleic acid single linear cDNA NO NO 1119 CCCTATCTTTATTGTGACGA 20 20 base pairs nucleic acid single linear cDNA NO NO 1120CCTATCTTTA TTGTGACGAG 20 20 base pairs nucleic acid single linear cDNANO NO 1121 CTATCTTTAT TGTGACGAGG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1122 TATCTTTATT GTGACGAGGG 20 20 base pairs nucleicacid single linear cDNA NO NO 1123 ATCTTTATTG TGACGAGGGG 20 20 basepairs nucleic acid single linear cDNA NO NO 1124 TCTTTATTGT GACGAGGGGT20 20 base pairs nucleic acid single linear cDNA NO NO 1125 CTTTATTGTGACGAGGGGTC 20 20 base pairs nucleic acid single linear cDNA NO NO 1126TTTATTGTGA CGAGGGGTCG 20 20 base pairs nucleic acid single linear cDNANO NO 1127 TTATTGTGAC GAGGGGTCGT 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1128 TATTGTGACG AGGGGTCGTT 20 20 base pairs nucleicacid single linear cDNA NO NO 1129 ATTGTGACGA GGGGTCGTTG 20 20 basepairs nucleic acid single linear cDNA NO NO 1130 TTGTGACGAG GGGTCGTTGC20 20 base pairs nucleic acid single linear cDNA NO NO 1131 TGTGACGAGGGGTCGTTGCC 20 20 base pairs nucleic acid single linear cDNA NO NO 1132GTGACGAGGG GTCGTTGCCA 20 20 base pairs nucleic acid single linear cDNANO NO 1133 TGACGAGGGG TCGTTGCCAA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1134 GACGAGGGGT CGTTGCCAAA 20 20 base pairs nucleicacid single linear cDNA NO NO 1135 ACGAGGGGTC GTTGCCAAAG 20 20 basepairs nucleic acid single linear cDNA NO NO 1136 CGAGGGGTCG TTGCCAAAGA20 20 base pairs nucleic acid single linear cDNA NO NO 1137 GAGGGGTCGTTGCCAAAGAG 20 20 base pairs nucleic acid single linear cDNA NO NO 1138AGGGGTCGTT GCCAAAGAGT 20 20 base pairs nucleic acid single linear cDNANO NO 1139 GGGGTCGTTG CCAAAGAGTG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1140 GGGTCGTTGC CAAAGAGTGA 20 20 base pairs nucleicacid single linear cDNA NO NO 1141 GGTCGTTGCC AAAGAGTGAT 20 20 basepairs nucleic acid single linear cDNA NO NO 1142 GTCGTTGCCA AAGAGTGATC20 20 base pairs nucleic acid single linear cDNA NO NO 1143 
TCGTTGCCAAAGAGTGATCT 20 20 base pairs nucleic acid single linear cDNA NO NO 1144CGTTGCCAAA GAGTGATCTG 20 20 base pairs nucleic acid single linear cDNANO NO 1145 GTTGCCAAAG AGTGATCTGA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1146 TTGCCAAAGA GTGATCTGAG 20 20 base pairs nucleicacid single linear cDNA NO NO 1147 TGCCAAAGAG TGATCTGAGG 20 20 basepairs nucleic acid single linear cDNA NO NO 1148 GCCAAAGAGT GATCTGAGGG20 20 base pairs nucleic acid single linear cDNA NO NO 1149 CCAAAGAGTGATCTGAGGGA 20 20 base pairs nucleic acid single linear cDNA NO NO 1150CAAAGAGTGA TCTGAGGGAA 20 20 base pairs nucleic acid single linear cDNANO NO 1151 AAAGAGTGAT CTGAGGGAAG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1152 AAGAGTGATC TGAGGGAAGT 20 20 base pairs nucleicacid single linear cDNA NO NO 1153 AGAGTGATCT GAGGGAAGTT 20 20 basepairs nucleic acid single linear cDNA NO NO 1154 GAGTGATCTG AGGGAAGTTA20 20 base pairs nucleic acid single linear cDNA NO NO 1155 AGTGATCTGAGGGAAGTTAA 20 20 base pairs nucleic acid single linear cDNA NO NO 1156GTGATCTGAG GGAAGTTAAA 20 20 base pairs nucleic acid single linear cDNANO NO 1157 TGATCTGAGG GAAGTTAAAG 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1158 GATCTGAGGG AAGTTAAAGG 20 20 base pairs nucleicacid single linear cDNA NO NO 1159 ATCTGAGGGA AGTTAAAGGA 20 20 basepairs nucleic acid single linear cDNA NO NO 1160 TCTGAGGGAA GTTAAAGGAT20 20 base pairs nucleic acid single linear cDNA NO NO 1161 CTGAGGGAAGTTAAAGGATA 20 20 base pairs nucleic acid single linear cDNA NO NO 1162TGAGGGAAGT TAAAGGATAC 20 20 base pairs nucleic acid single linear cDNANO NO 1163 GAGGGAAGTT AAAGGATACA 20 20 base pairs nucleic acid singlelinear cDNA NO NO 1164 AGGGAAGTTA AAGGATACAG 20 20 base pairs nucleicacid single linear cDNA NO NO 1165 GGGAAGTTAA AGGATACAGT 20
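
Consecutive entries in this part of the listing appear to be 20-mers that each begin one nucleotide after the previous entry. The following is a minimal sketch, not part of the Appendix source code, of how such a tiled set can be generated and checked. The function names `tile` and `reassemble` are hypothetical, and the four probes are SEQ ID NOs 820-823 copied from the listing with the internal spaces removed.

```python
def tile(sequence: str, n: int = 20) -> list[str]:
    """Return every length-n window of `sequence`, stepping one nucleotide."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def reassemble(probes: list[str]) -> str:
    """Rebuild the parent sequence from probes that each shift by one base."""
    merged = probes[0]
    for prev, cur in zip(probes, probes[1:]):
        # Neighbouring probes must share n-1 bases for the tiling to hold.
        if prev[1:] != cur[:-1]:
            raise ValueError(f"{cur} does not overlap {prev} by n-1 bases")
        merged += cur[-1]
    return merged


# SEQ ID NOs 820-823 as they appear in the listing, with spaces removed.
probes = [
    "TCCATTCCTGGCTTTAATTT",
    "CCATTCCTGGCTTTAATTTT",
    "CATTCCTGGCTTTAATTTTA",
    "ATTCCTGGCTTTAATTTTAC",
]
parent = reassemble(probes)
```

Tiling the reassembled `parent` with `tile(parent, 20)` gives back the same four probes, which is one way to confirm that a block of entries was transcribed without gaps.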

What is claimed is:
 1. A method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence, said method comprising: (a) identifying a predetermined number of unique oligonucleotides within a nucleotide sequence that is hybridizable with said target nucleotide sequence, said oligonucleotides being chosen to sample the entire length of said nucleotide sequence, (b) determining and evaluating for each of said oligonucleotides at least one parameter that is independently predictive of the ability of each of said oligonucleotides to hybridize to said target nucleotide sequence, (c) identifying a subset of oligonucleotides within said predetermined number of unique oligonucleotides based on an examination of said parameter, and (d) identifying oligonucleotides in said subset that are clustered along a region of said nucleotide sequence that is hybridizable to said target nucleotide sequence.
 2. A method according to claim 1 which comprises ranking said oligonucleotides of step (d) based on the size of said clusters of oligonucleotides.
 3. A method according to claim 1 wherein said unique oligonucleotides are of identical length N.
 4. A method according to claim 3 wherein said unique oligonucleotides are spaced one nucleotide apart, said predetermined number comprising L−N+1 oligonucleotides, where L is the length of the hybridizable sequence.
 5. A method according to claim 1 wherein said parameter is selected from the group consisting of composition factors, thermodynamic factors, chemosynthetic efficiencies and kinetic factors.
 6. A method according to claim 1 wherein said parameter is a composition factor selected from the group consisting of mole fraction (G+C), percent (G+C), sequence complexity, and sequence information content.
 7. A method according to claim 1 wherein said parameter is a thermodynamic factor selected from the group consisting of predicted duplex melting temperature, predicted enthalpy of duplex formation, predicted entropy of duplex formation, predicted free energy of duplex formation, predicted melting temperature of the most stable intramolecular structure of the oligonucleotide or its complement, predicted enthalpy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted entropy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted free energy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted melting temperature of the most stable hairpin structure of the oligonucleotide or its complement, predicted enthalpy of the most stable hairpin structure of the oligonucleotide or its complement, predicted entropy of the most stable hairpin structure of the oligonucleotide or its complement, predicted free energy of the most stable hairpin structure of the oligonucleotide or its complement, and thermodynamic partition function for intramolecular structure of the oligonucleotide or its complement.
 8. A method according to claim 1 wherein said parameter is a chemosynthetic efficiency selected from the group consisting of coupling efficiencies and overall efficiency of the synthesis of a target nucleotide sequence or an oligonucleotide probe.
 9. A method according to claim 1 wherein said parameter is a kinetic factor selected from the group consisting of steric factors calculated via molecular modeling, rate constants calculated via molecular dynamics simulations, rate constants calculated via semi-empirical kinetic modeling, associative rate constants, dissociative rate constants, enthalpies of activation, entropies of activation, and free energies of activation.
 10. A method according to claim 1 wherein said parameter is derived from a factor by mathematical transformation of said factor.
 11. A method according to claim 1 which comprises ranking said clustered oligonucleotides of step (d) based on the size of said clusters of oligonucleotides and selecting a subset of said clustered oligonucleotides.
 12. A method according to claim 11 wherein said subset consists of any number of oligonucleotides within said cluster of oligonucleotides.
 13. A method according to claim 11 wherein the subset of said clustered oligonucleotides is selected to statistically sample the cluster.
 14. A method according to claim 13 wherein said statistical sample consists of oligonucleotides spaced at the first quartile, median and third quartile of the cluster of oligonucleotides.
 15. A method according to claim 1 wherein said parameters are determined for said oligonucleotides by means of a computer program.
 16. A method according to claim 1 wherein said oligonucleotides are attached to a surface.
 17. A method according to claim 1 wherein said oligonucleotides are DNA.
 18. A method according to claim 1 wherein said oligonucleotides are RNA.
 19. A method according to claim 1 wherein said oligonucleotides contain chemically modified nucleotides.
 20. A method according to claim 1 wherein said target nucleotide sequence is RNA.
 21. A method according to claim 1 wherein said target nucleotide sequence is DNA.
 22. A method according to claim 1 wherein said target nucleotide sequence contains chemically modified nucleotides.
 23. A method according to claim 1 wherein said parameter is, for each oligonucleotide/target nucleotide sequence duplex, the difference between the predicted duplex melting temperature corrected for salt concentration and the temperature of hybridization of each of said oligonucleotides with said target nucleotide sequence.
 24. A method according to claim 1 wherein step (c) comprises identifying a subset of oligonucleotides within said predetermined number of unique oligonucleotides by establishing cut-off values for said parameter.
 25. A method according to claim 1 wherein said step (c) comprises identifying a subset of oligonucleotides within said predetermined number of unique oligonucleotides by converting the values of said parameter into a dimensionless number.
 26. A method according to claim 25 wherein said value is converted into a dimensionless number by determining a dimensionless score for each parameter resulting in a distribution of scores having a mean value of zero and a standard deviation of one.
 27. A method according to claim 26 which comprises optimizing a method according to calculation for said parameter based on said individual scores.
 28. A method according to claim 1 wherein step (b) comprises determining at least two parameters wherein said parameters are poorly correlated with respect to one another.
 29. Amethod according to claim 28 wherein said parameters are derived from acombination of factors by mathematical transformation of those factors.30. A method according to claim 1 wherein step (b) comprises determiningtwo parameters at least one of said parameters being the associationfree energy between a subsequence within each of said oligonucleotidesand its complementary sequence on said target nucleotide sequence.
31. A method according to claim 30 wherein said subsequence is 3 to 9 nucleotides in length.
32. A method according to claim 30 wherein said subsequence is 5 to 7 nucleotides in length.
33. A method according to claim 30 wherein said subsequence is at least three nucleotides from the terminus of said oligonucleotides.
34. A method according to claim 30 wherein said subsequence is at least three nucleotides from a surface to which said oligonucleotides are attached.
35. A method according to claim 30 wherein said oligonucleotides are attached to a surface and said subsequence is at least five nucleotides from the terminus of said oligonucleotides that is attached to said surface and at least three nucleotides from the free end of said oligonucleotides.
36. A method according to claim 30 wherein the association free energy of the members of a set of subsequences within each of said oligonucleotides is determined and said subsequence having the minimum value is identified.
37. A method according to claim 1 which comprises including oligonucleotides that are adjacent to said oligonucleotides in said subset that are clustered along a region of said target nucleotide sequence.
38. A method according to claim 1 which comprises (i) identifying a subset of oligonucleotides within said predetermined number of unique oligonucleotides by establishing cut-off values for each of said parameters.
39. A method according to claim 1 which comprises determining the sizes of said clusters of step (d) by counting the number of contiguous oligonucleotides in said region of said hybridizable sequence.
40. A method according to claim 1 which comprises determining the sizes of said clusters of step (d) by counting the number of oligonucleotides in said subset that begin in a region of predetermined length in said hybridizable sequence.
41. A method for predicting the potential of an oligonucleotide to hybridize to a complementary target nucleotide sequence, said method comprising: (a) identifying a set of overlapping oligonucleotides from a nucleotide sequence that is complementary to said target nucleotide sequence, (b) determining and evaluating for each of said oligonucleotides at least two parameters that are independently predictive of the ability of each of said oligonucleotides to hybridize to said target nucleotide sequence wherein said parameters are poorly correlated with respect to one another, (c) identifying a subset of oligonucleotides within said set of oligonucleotides based on an examination of said parameters, and (d) identifying oligonucleotides in said subset that are clustered along a region of said complementary nucleotide sequence.
42. A method according to claim 41 which comprises ranking said oligonucleotides of step (d) based on the size of said clusters of oligonucleotides.
43. A method according to claim 41 which comprises determining the sizes of said clusters of step (d) by counting the number of contiguous oligonucleotides in said region of said complementary sequence.
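Illustrative note (not part of the claims): the contiguous-count measure of cluster size recited in claims 42 and 43 might be sketched as follows. The sketch assumes each oligonucleotide in the subset is represented by its integer start position on the complementary sequence and that "contiguous" means start positions differing by exactly one; the function name `contiguous_clusters` and the sample positions are hypothetical.

```python
def contiguous_clusters(starts):
    """Group start positions into runs of consecutive integers.

    Assumption: two oligonucleotides are "contiguous" when their start
    positions on the complementary sequence differ by exactly one.
    """
    clusters = []
    for pos in sorted(set(starts)):
        if clusters and pos == clusters[-1][-1] + 1:
            clusters[-1].append(pos)   # extend the current run
        else:
            clusters.append([pos])     # start a new run
    return clusters


# Hypothetical start positions of oligonucleotides that passed step (c).
clusters = contiguous_clusters([3, 4, 5, 9, 10, 20])
sizes = [len(c) for c in clusters]

# Rank clusters by size, largest first, as one reading of claim 42.
ranked = sorted(clusters, key=len, reverse=True)
```

Ranking by `len` is only one possible ordering; ties are left in their original positional order.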
44. A method according to claim 41 which comprises determining the sizes of said clusters of step (d) by counting the number of oligonucleotides in said subset that begin in a region of set length in said complementary sequence.
45. A method according to claim 41 wherein said overlapping oligonucleotides are of identical length N.
46. A method according to claim 45 wherein said overlapping oligonucleotides are spaced one nucleotide apart, said set comprising L−N+1 oligonucleotides, where L is the length of the complementary sequence.
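Illustrative note (not part of the claims): the set of claim 46 can be pictured as a sliding window of length N moved one nucleotide at a time along a complementary sequence of length L, which yields L−N+1 oligonucleotides. A minimal sketch, in which the function name `tile_probes` and the example sequence are hypothetical:

```python
def tile_probes(complement: str, n: int) -> list[str]:
    """Return every length-n window of `complement`, stepping by one nucleotide."""
    if n <= 0 or n > len(complement):
        raise ValueError("n must be between 1 and the sequence length")
    # range() stops at L - N, so exactly L - N + 1 windows are produced.
    return [complement[i:i + n] for i in range(len(complement) - n + 1)]


# Hypothetical complementary sequence with L = 10, tiled with N = 4.
probes = tile_probes("ACGTACGTAC", 4)
```

With L = 10 and N = 4 the sketch returns 10 − 4 + 1 = 7 overlapping oligonucleotides.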
47. A method according to claim 41 wherein said parameters are each independently selected from the group consisting of composition factors, thermodynamic factors, chemosynthetic efficiencies and kinetic factors.
48. A method according to claim 41 wherein said parameters are composition factors selected from the group consisting of mole fraction (G+C), percent (G+C), sequence complexity, and sequence information content.
49. A method according to claim 41 wherein said parameters are thermodynamic factors selected from the group consisting of predicted duplex melting temperature, predicted enthalpy of duplex formation, predicted entropy of duplex formation, predicted free energy of duplex formation, predicted melting temperature of the most stable intramolecular structure of the oligonucleotide or its complement, predicted enthalpy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted entropy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted free energy of the most stable intramolecular structure of the oligonucleotide or its complement, predicted melting temperature of the most stable hairpin structure of the oligonucleotide or its complement, predicted enthalpy of the most stable hairpin structure of the oligonucleotide or its complement, predicted entropy of the most stable hairpin structure of the oligonucleotide or its complement, predicted free energy of the most stable hairpin structure of the oligonucleotide or its complement, thermodynamic partition function for intramolecular structure of the oligonucleotide or its complement.
50. A method according to claim 41 wherein any of said parameters is derived from a factor by mathematical transformation of said factor.
51. A method according to claim 49 wherein any of said parameters is derived from a combination of factors by mathematical transformation of those factors.
52. A method according to claim 41 wherein said parameters are chemosynthetic efficiencies selected from the group consisting of coupling efficiencies and overall efficiencies of the syntheses of a target nucleotide sequence or an oligonucleotide probe.
53. A method according to claim 41 wherein said parameters are kinetic factors selected from the group consisting of steric factors calculated via molecular modeling, rate constants calculated via molecular dynamics simulations, rate constants calculated via semi-empirical kinetic modeling, associative rate constants, dissociative rate constants, enthalpies of activation, entropies of activation, and free energies of activation.
54. A method according to claim 41 which comprises ranking said clustered oligonucleotides of step (d) based on the size of said clusters of oligonucleotides and selecting a subset of said clustered oligonucleotides.
55. A method according to claim 54 wherein said subset consists of any number of oligonucleotides within said cluster of oligonucleotides.
56. A method according to claim 54 wherein the subset of said clustered oligonucleotides are selected to statistically sample the cluster.
57. A method according to claim 54 wherein said statistical sample consists of oligonucleotides spaced at the first quartile, median and third quartile of the cluster of oligonucleotides.
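Illustrative note (not part of the claims): one possible reading of the quartile sampling of claim 57 is to order the members of a cluster by start position and pick the members lying 25%, 50% and 75% of the way along that ordering. The claims do not fix how fractional positions are resolved, so the rounding below, the function name `quartile_sample` and the example cluster are all assumptions.

```python
def quartile_sample(cluster):
    """Pick the members at the first quartile, median and third quartile.

    Assumption: quartile positions are taken along the ordered list of
    start positions and rounded to the nearest index.
    """
    ordered = sorted(cluster)
    if not ordered:
        raise ValueError("cluster must not be empty")
    last = len(ordered) - 1
    indices = [round(q * last) for q in (0.25, 0.5, 0.75)]
    return [ordered[i] for i in indices]


# Hypothetical cluster of nine contiguous start positions, 10 through 18.
sample = quartile_sample(range(10, 19))
```

For very small clusters the three picks can coincide; a practical implementation would likely deduplicate them.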
58. A method according to claim 41 wherein said parameters are determined for said oligonucleotides by means of a computer program.
59. A method according to claim 41 wherein said oligonucleotides are attached to a surface.
60. A method according to claim 41 wherein said oligonucleotides are DNA.
61. A method according to claim 41 wherein said oligonucleotides are RNA.
62. A method according to claim 41 wherein said oligonucleotides contain chemically modified nucleotides.
63. A method according to claim 41 wherein said target nucleotide sequence is RNA.
64. A method according to claim 41 wherein said target nucleotide sequence is DNA.
65. A method according to claim 41 wherein said target nucleotide sequence contains chemically modified nucleotides.
66. A method according to claim 41 wherein said parameter is, for each oligonucleotide/target nucleotide sequence duplex, the difference between the predicted duplex melting temperature corrected for salt concentration and the temperature of hybridization of each of said oligonucleotides with said target nucleotide sequence.
67. A method according to claim 41 wherein step (c) comprises identifying a subset of oligonucleotides within said set of oligonucleotides by establishing cut-off values for each set of parameters.
68. A method according to claim 41 wherein said step (c) comprises identifying a subset of oligonucleotides within said set of oligonucleotides by converting the values of said parameters into a dimensionless number.
69. A method according to claim 66 wherein said values are converted into dimensionless numbers by (a) determining a dimensionless score for each parameter resulting in a distribution of scores having a mean value of zero and a standard deviation of one and (b) calculating a combination score by evaluating a weighted average of the individual scores.
70. A method according to claim 69 wherein step (b) comprises optimizing the weighting factors based on comparison of said individual scores to a calibration data set.
71. A method according to claim 41 wherein step (b) comprises determining two parameters at least one of said parameters being the association free energy between a subsequence within each of said oligonucleotides and its complementary sequence on said target nucleotide sequence.
72. A method according to claim 71 wherein said subsequence is 3 to 9 nucleotides in length.
73. A method according to claim 71 wherein said subsequence is 5 to 7 nucleotides in length.
74. A method according to claim 71 wherein said subsequence is at least three nucleotides from the terminus of said oligonucleotides.
75. A method according to claim 71 wherein said oligonucleotides are attached to a surface and said subsequence is at least five nucleotides from the terminus of said oligonucleotides that is attached to said surface and at least three nucleotides from the free end of said oligonucleotides.
76. A method according to claim 71 wherein the association free energy of the members of a set of subsequences within each of said oligonucleotides is determined and said subsequence having the minimum value is identified.
77. A method according to claim 41 which comprises including in said evaluation oligonucleotides that are adjacent to said oligonucleotides in said subset that are clustered along a region of said target nucleotide sequence.
78. A method for predicting the potential of an oligonucleotide to hybridize to a complementary target nucleotide sequence, said method comprising: (a) obtaining, from a nucleotide sequence complementary to said target nucleotide sequence, a set of overlapping oligonucleotides of identical length N and spaced one nucleotide apart, said set comprising L−N+1 oligonucleotides, (b) determining and evaluating for each of said oligonucleotides the parameters: (i) the predicted melt temperature of the duplex of said oligonucleotide and said target nucleotide sequence corrected for salt concentration and (ii) predicted free energy of the most stable intramolecular structure of the oligonucleotide at the temperature of hybridization of each of said oligonucleotides with said target nucleotide sequence, (c) identifying a subset of oligonucleotides within said set of oligonucleotides based on an examination of said parameters by establishing cut-off values for each of said parameters, (d) ranking oligonucleotides in said subset that are clustered along a region of said complementary nucleotide sequence based on the size of said clusters of oligonucleotides, and (e) selecting a subset of said clustered oligonucleotides.
79. A method according to claim 78 wherein said subset consists of any number of oligonucleotides within said cluster of oligonucleotides.
80. A method according to claim 78 wherein the subset of said clustered oligonucleotides are selected to statistically sample the cluster.
81. A method according to claim 78 wherein said parameters are derived by mathematical transformation of the factors named in claim 76(b).
82. A method according to claim 78 wherein the melting temperature of step (b) is transformed by subtracting the temperature of hybridization.
83. A method according to claim 78 which comprises determining the sizes of said clusters of step (d) by counting the number of contiguous oligonucleotides in said region of said complementary sequence.
84. A method according to claim 78 wherein said statistical sample consists of oligonucleotides spaced at the first quartile, median and third quartile of the cluster of oligonucleotides.
85. A method according to claim 78 wherein said parameters are determined for said oligonucleotides by means of a computer program.
86. A method according to claim 78 wherein said oligonucleotides are attached to a surface.
87. A method according to claim 78 wherein said oligonucleotides are DNA.
88. A method according to claim 78 wherein said oligonucleotides are RNA.
89. A method according to claim 78 wherein said oligonucleotides contain chemically modified nucleotides.
90. A method according to claim 78 wherein said target nucleotide sequence is RNA.
91. A method according to claim 78 wherein said target nucleotide sequence is DNA.
92. A method according to claim 78 wherein said target nucleotide sequence contains chemically modified nucleotides.
93. A method according to claim 68 wherein the following equations are used for converting the values of said parameters into a dimensionless number:

$$s_{i,x} = \frac{x_{i} - \langle x \rangle}{\sigma_{\{x\}}},$$

where $s_{i,x}$ is the dimensionless score derived from parameter x calculated for oligonucleotide i, $x_{i}$ is the value of parameter x calculated for oligonucleotide i, $\langle x \rangle$ is the average of parameter x calculated for all of the oligonucleotides under consideration for a given nucleotide sequence target, and $\sigma_{\{x\}}$ is the standard deviation of parameter x calculated for all of the oligonucleotides under consideration for a given nucleotide sequence target, and is given by the equation

$$\sigma_{\{x\}} = \sqrt{\frac{\sum_{j=1}^{L-N+1} \left( x_{j} - \langle x \rangle \right)^{2}}{L - N}},$$

where the target sequence is of length L and the oligonucleotides are of length N.
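Illustrative note (not part of the claims): the conversion of claim 93 is a standard score computed with the sample standard deviation, since the sum runs over L−N+1 values and is divided by L−N. A minimal sketch, assuming the parameter values for all oligonucleotides are already available as a list; the function name `dimensionless_scores` and the example values are hypothetical.

```python
import math


def dimensionless_scores(values):
    """Convert raw parameter values x_i into scores s_i = (x_i - <x>) / sigma."""
    m = len(values)                      # m plays the role of L - N + 1
    if m < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / m               # <x>
    # Divisor m - 1 corresponds to L - N in the claimed equation.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / (m - 1))
    if sigma == 0:
        raise ValueError("all values are identical; scores are undefined")
    return [(v - mean) / sigma for v in values]


# Hypothetical parameter values for eight overlapping oligonucleotides.
scores = dimensionless_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

By construction the resulting scores have a mean of zero and a sample standard deviation of one, which is what allows parameters with different units to be combined.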
94. A method according to claim 68 wherein a combination score $S_{i}$ is calculated by evaluating a weighted average of the individual values of the dimensionless scores $s_{i,x}$ by the equation:

$$S_{i} = \sum_{\{x\}} q_{x} s_{i,x},$$

where $q_{x}$ is the weight assigned to the score derived from parameter x, the individual values of $q_{x}$ are always greater than zero, and the sum of the weights $q_{x}$ is unity.
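Illustrative note (not part of the claims): the weighted average of claim 94 might be sketched as below, assuming the dimensionless scores are held in a dictionary keyed by parameter name. The function name `combination_scores`, the parameter labels `"tm"` and `"dg"`, and the numeric values are hypothetical.

```python
import math


def combination_scores(scores_by_param, weights):
    """Compute S_i as the sum over parameters x of q_x * s_{i,x}."""
    if set(scores_by_param) != set(weights):
        raise ValueError("each parameter needs exactly one weight")
    if any(q <= 0 for q in weights.values()):
        raise ValueError("weights must be greater than zero")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to unity")
    n = len(next(iter(scores_by_param.values())))
    if any(len(s) != n for s in scores_by_param.values()):
        raise ValueError("all parameters must cover the same oligonucleotides")
    return [
        sum(weights[x] * scores_by_param[x][i] for x in weights)
        for i in range(n)
    ]


# Hypothetical scores for two oligonucleotides and two parameters.
combined = combination_scores(
    {"tm": [1.0, -1.0], "dg": [0.0, 2.0]},
    {"tm": 0.25, "dg": 0.75},
)
```

The checks mirror the two constraints stated in the claim: every weight is positive and the weights sum to one.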
95. A method according to claim 78 where clustering is determined by calculating a moving window-averaged combination score $\langle S_{i} \rangle$ for the ith probe by the equation:

$$\langle S_{i} \rangle = \frac{1}{w} \sum_{j = i - \frac{w-1}{2}}^{i + \frac{w-1}{2}} S_{j}, \quad w = \text{an odd integer},$$

where w is the length of the window for averaging, and then applying a cutoff filter to the value of $\langle S_{i} \rangle$.
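Illustrative note (not part of the claims): the moving window average of claim 95 might be sketched as follows. The claim does not say how probes nearer than (w−1)/2 positions to either end are handled; the sketch assumes they are simply skipped. The function name `window_average`, the cutoff value and the example scores are hypothetical.

```python
def window_average(combined, w):
    """Return {i: <S_i>} for every index i with a full window of odd width w."""
    if w < 1 or w % 2 == 0:
        raise ValueError("w must be a positive odd integer")
    half = (w - 1) // 2
    averaged = {}
    # Assumption: indices without a complete window are left out.
    for i in range(half, len(combined) - half):
        averaged[i] = sum(combined[i - half:i + half + 1]) / w
    return averaged


# Hypothetical combination scores S_i for five probes, window length 3.
averaged = window_average([1.0, 2.0, 3.0, 4.0, 5.0], 3)

# Cutoff filter: keep probes whose window-averaged score reaches a threshold.
cutoff = 3.0
selected = [i for i, value in averaged.items() if value >= cutoff]
```

Averaging over neighbours means an isolated high-scoring probe is damped while a run of good probes keeps a high value, which is how the window expresses clustering.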
96. A method according to claim 94 wherein optimization of the weights $q_{x}$ is performed by varying the values of the weights so that the correlation coefficient $\rho_{\{\langle S_{i} \rangle\},\{V_{i}\}}$ between the set of window-averaged combination scores $\{\langle S_{i} \rangle\}$ and a set of calibration experimental measurements $\{V_{i}\}$ is maximized. The correlation coefficient $\rho_{\{\langle S_{i} \rangle\},\{V_{i}\}}$ is calculated from the equation

$$\rho_{x,y} = \frac{\mathrm{Covariance}(x,y)}{\sqrt{\mathrm{Variance}(x)\,\mathrm{Variance}(y)}},$$

where $x = \langle S_{i} \rangle$, $y = V_{i}$ and the Covariance (x,y) is defined by

$$\mathrm{Covariance}(x,y) = \frac{1}{N} \sum_{i=1}^{N} \left( x_{i} - \mu_{x} \right) \left( y_{i} - \mu_{y} \right).$$

The quantities $\mu_{x}$ and $\mu_{y}$ are the averages of the quantities x and y, while the variances are the squares of the standard deviations.
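Illustrative note (not part of the claims): the correlation coefficient of claim 96 might be computed as below. The sketch assumes the variances use the same 1/N divisor as the covariance, so that the coefficient stays between −1 and 1, and it treats N here as the number of paired data points rather than the oligonucleotide length. The function name `correlation` and the sample data are hypothetical.

```python
import math


def correlation(x, y):
    """Correlation coefficient: Covariance(x, y) / sqrt(Variance(x) * Variance(y))."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Assumption: population variances (divisor n), matching the covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical window-averaged scores and calibration measurements.
rho = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

An optimizer would call such a function repeatedly while varying the weights $q_{x}$ and keep the weights that give the largest coefficient; the choice of optimizer is not specified in the claim.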
97. A method according to claim 95 wherein the cutoff filter selects the lowest values of the window-averaged combination score $\langle S_{i} \rangle$ and the clustered probes so identified are predicted to exhibit low hybridization efficiency.
98. A computer based method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence, said method comprising: (a) identifying under computer control a predetermined number of unique oligonucleotides within a nucleotide sequence that is hybridizable with said target nucleotide sequence, said oligonucleotides being chosen to sample the entire length of said nucleotide sequence, (b) under computer control, determining and evaluating for each of said oligonucleotides a value for at least one parameter that is independently predictive of the ability of each of said oligonucleotides to hybridize to said target nucleotide sequence and storing said parameter values, (c) identifying under computer control, from said stored parameter values, a subset of oligonucleotides within said predetermined number of unique oligonucleotides based on an examination of said parameter, and (d) identifying under computer control oligonucleotides in said subset that are clustered along a region of said nucleotide sequence that is hybridizable to said target nucleotide sequence.
99. A method according to claim 98 wherein the identified subset of oligonucleotide sequences is electronically transferred to an oligonucleotide array manufacturing system.
100. A computer system for conducting a method for predicting the potential of an oligonucleotide to hybridize to a target nucleotide sequence, said system comprising: (a) input means for introducing a target nucleotide sequence into said computer system, (b) means for determining a number of unique oligonucleotide sequences that are within a nucleotide sequence that is hybridizable with said target nucleotide sequence, said oligonucleotide sequences being chosen to sample the entire length of said nucleotide sequence, (c) memory means for storing said oligonucleotide sequences, (d) means for controlling said computer system to carry out a determination and evaluation for each of said oligonucleotide sequences a value for at least one parameter that is independently predictive of the ability of each of said oligonucleotide sequences to hybridize to said target nucleotide sequence, (e) means for storing said parameter values, (f) means for controlling said computer to carry out an identification from said stored parameter values a subset of oligonucleotide sequences within said number of unique oligonucleotide sequences based on an examination of said parameter, (g) means for storing said subset of oligonucleotides, (h) means for controlling said computer to carry out an identification of oligonucleotide sequences in said subset that are clustered along a region of said nucleotide sequence that is hybridizable to said target nucleotide sequence, (i) means for storing said oligonucleotide sequences in said subset, and (j) means for outputting data relating to said oligonucleotide sequences in said subset.
101. A computer system according to claim 100 wherein the identified subset of oligonucleotide sequences is electronically transferred to an oligonucleotide array manufacturing system.